Abstract

Once the set of finite graphs is equipped with an algebra structure 
(arising from the definition of operations that generalize the
concatenation of words), one can define the notion of a recognizable set of graphs in
terms of finite congruences. Applications to the construction of efficient 
algorithms and to the theory of context-free sets of graphs follow natu- 
rally. The class of recognizable sets depends on the signature of graph 
operations. We consider three signatures related respectively to Hyper- 
edge Replacement (HR) context-free graph grammars, to Vertex Replace- 
ment (VR) context-free graph grammars, and to modular decompositions 
of graphs. We compare the corresponding classes of recognizable sets. 
We show that they are robust in the sense that many variants of each 
signature (where in particular operations are defined by quantifier-free
formulas, a quite flexible framework) yield the same notions of recog- 
nizability. We prove that for graphs without large complete bipartite 
subgraphs, HR-recognizability and VR-recognizability coincide. The same 
combinatorial condition equates HR-context-free and VR-context-free sets
of graphs. Inasmuch as possible, results are formulated in the more general 
framework of relational structures. 



1 Introduction 

The notion of a recognizable language is a fundamental concept in Formal Lan- 
guage Theory, which has been clearly identified since the 1950's. It is important 
because of its numerous applications, in particular for the construction of com- 
pilers, and also for the development of the Theory: indeed, these languages can 
be specified in several very different ways, by means of automata, congruences, 
regular expressions and logical formulas. This multiplicity of quite different def- 
initions is a clear indication that the notion is central since one arrives at it in a 
natural way from different approaches. The equivalence of definitions is proved 
in fundamental results by Kleene, Myhill and Nerode, Elgot and Büchi.

The notion of a recognizable set was extended in the 1960s to trees
(actually to trees representing finite algebraic terms), to infinite words and to 
infinite trees. In the present article we discuss its extension to sets of finite 
graphs. 

The recognizability of a set of finite words or trees can be defined in several
ways, as mentioned above, and in particular by finite deterministic automata. 
This definition (together with the related effective translations from other defi- 
nitions) provides linear-time recognition algorithms, which are essential for com- 
piler construction, coding, text processing, and in other situations. Recognizable 
sets of words can also be defined in an algebraic way by finite saturating con- 
gruences relative to the monoid structure. These definitions, by automata and 
congruences, extend smoothly to the case of finite trees (i.e., algebraic terms), 
using the natural algebra structure. The notion of recognizability in a general 
algebra is due to Mezei and Wright [37]. We will not discuss here the extensions
to infinite words and trees, which raise specific problems surveyed by Thomas
and by Perrin and Pin. Our aim will be to consider sets of finite graphs.

For finite graphs, there is no automaton model, except in very special cases, 
and in particular in the case of graphs representing certain labelled partially 
ordered sets and traces (a trace is a directed acyclic graph, representing the 
equivalence class of a word w.r.t. a partial commutation relation), see the 
volume edited by Diekert [22 and the papers by Lodaya and Weil [321 155] 
and Esik and Nemeth [21]. Algebraic definitions via finite congruences can 
be given because the set of finite graphs can be equipped with an algebraic 
structure, based on graph operations like the concatenation of words. However, 
many operations on graphs can be defined, and there is no prominent choice 
for a standard algebraic structure like in the case of words where a unique 
associative binary operation is sufficient. Several algebraic structures on graphs 
can be defined, and distinct notions of recognizability follow from these possible 
choices. It appears nevertheless that two graph algebras, called the HR-algebra 
and the VR-algebra for reasons explained below, emerge and provide robust 
notions of recognizability. The main purpose of this paper is to demonstrate 
the robustness of these notions. By robustness, we mean that taking variants of 
the basic definitions does not modify the corresponding classes of recognizable 
sets of graphs. 

In any algebra, one can define two families of sets, the recognizable sets and
the equational sets. The equational sets are defined as the components of the 
least solutions of certain systems of recursive set equations, written with set 
union and the operations of the algebra, extended to sets in the standard way. 
Equational sets can be considered as the natural extension of context-free
languages in a general algebraic framework (Mezei and Wright [37]; see Courcelle
[12] for a thorough development). The two graph algebras introduced above, the
HR- and the VR-algebra, are familiar to readers interested in graph grammars,
because their equational sets are the (context-free) Hyperedge Replacement (HR) 
sets of graphs on the one hand, and the (context-free) Vertex Replacement (VR) 
sets on the other. Both classes of context-free sets of graphs can be defined in 
alternative, more complicated ways in terms of graph rewritings, and are robust 
in the sense that they are closed under certain transformations expressible in
Monadic Second-Order Logic (Courcelle [15]).

The main results of this paper, described below in more detail, are: 

1) the robustness of the classes of VR- and HR-recognizable sets of graphs,

2) the robustness of the class of recognizable sets of finite relational structures 
(equivalently of simple directed ranked hypergraphs), which extends the two
previous classes, 

3) the exhibition of structural conditions on sets of graphs implying that 
HR-recognizability and VR-recognizability coincide, 

4) the comparison of the recognizable sets of the VR-algebra and those of 
a closely related algebra representing modular decompositions (modular decom- 
position is another useful notion for graph algorithms). 

The notion of recognizability of a set of finite graphs is important for sev- 
eral reasons. First, because recognizability yields linear-time algorithms for the 
verification of a wide class of graph properties on graphs belonging to certain 
finitely generated graph algebras. These classes consist of graphs of bounded 
tree-width and of bounded clique-width. These two notions of graph complexity
are important for constructions of polynomial graph algorithms, see Downey
and Fellows [53] and Courcelle et al. [20]. Furthermore, these graph
properties are not very difficult to identify because Monadic second-order (MS) logic
can specify them in a formalized and uniform way. (In many cases, an MS 
formula can be obtained from the graph theoretical expression of a property). 
More precisely, a central result [8, 15, 20] says that every set of graphs (or
graph property) definable by an MS formula is recognizable (respectively admits 
such algorithms), for appropriate graph algebras. This general statement covers 
actually several distinct situations. 

Another reason comes from the theory of Graph Grammars. The intersec- 
tion of a context-free set of graphs and of a recognizable set is context-free (in 
the appropriate algebraic framework). This gives immediately many closure 
properties for context-free sets of graphs, via the use of MS logic as a specifi- 
cation language for graph properties. Recognizability also makes it possible to 
construct terminating and (in a certain sense) confluent graph rewriting rules by 
which one can recognize sets of graphs of bounded tree-width by graph reduction
in linear time, see Arnborg et al. 

Finally, recognizability is a basic notion for dealing with languages and sets 
of terms, and on this ground, its extension to sets of graphs is worth investi- 
gating. Logical characterizations of recognizability can be given using MS logic, 
extending many results in language theory [161 1281 I2H 1291 13U) . Several questions 
remain open in this research field. 

We have noted above that defining recognizability for sets of graphs cannot
be done in terms of finite automata, so that the algebraic definition in terms 
of finite congruences has no alternative. Another advantage of the algebraic 
definition is that it is given at the level of universal algebra (Mezei and Wright 
[37]), and thus applies to objects other than graphs. However, even in the
case of graphs, the algebraic setting is useful because it hides (temporarily) the 
complexities of operations on graphs and makes it possible to understand what
is going on at a structural level. 

We now present the main results of this article in more detail. The two
main algebraic structures on graphs, called VR and HR, originate from algebraic
descriptions of context-free graph grammars. Definitions will be given in the 
body of the text. It is enough for this introduction to retain that the operations 
of VR are more powerful than those of HR. Hence every HR-context-free set of 
graphs (i.e., defined by a grammar based on the operations of HR) is
VR-context-free, but not vice-versa. For recognizability, the inclusion goes in
the opposite direction: every VR-recognizable set is HR-recognizable but the
converse is not true. However, if the graphs of a set $L$ have no subgraph of
the form $K_{n,n}$ (the complete bipartite graph on $n + n$ vertices) for some
$n$, then $L$ is HR-recognizable if and only if it is VR-recognizable (this is
the main theorem of Section 6). A similar statement is known to hold under the
same hypothesis for context-free sets: if $L$ is without $K_{n,n}$ (i.e., no
graph in $L$ contains a subgraph isomorphic to $K_{n,n}$), then it is
HR-context-free if and only if it is VR-context-free (Courcelle [11]). The
proofs of the two statements are however different (and both difficult).

Up to now we have only discussed graphs, but our approach, which extends 
the approach developed by Courcelle in [8], also works for hypergraphs and for
relational structures. 

The operations on graphs, hypergraphs and structures are basically of three
types, defined in Section 3: we use only one binary operation, the disjoint
union; we use unary operations defined by quantifier-free first-order formulas;
and we use basic graphs and structures corresponding to nullary operations. In
this way we can generate graphs and structures by finite algebraic terms. The
quantifier-free definable operations can modify vertex and edge labels, and add
or delete edges. This notion is thus quite flexible. What is remarkable is that
these numerous operations can be added without altering the notion of
recognizability.

The main result of Section 4 states that the same recognizable sets of graphs
are obtained if one uses the basic VR-algebra (closely connected to the
definition of clique-width), the same algebra enriched with quantifier-free
definable operations, and even the larger algebra dealing with relational
structures. Variants of the VR-algebra which are useful, in particular for
algorithmic applications, are also considered, and they are proved to yield the
same class of recognizable sets.

In Section 5 we discuss similarly the HR-algebra, which is very important
because of its relation with tree-width and with context-free graph grammars.
We prove a robustness result relative to the subclass of graphs in which the
distinguished vertices denoted by distinct labels (nullary operations) are
different. The HR-operations are appropriate to handle graphs and hypergraphs
with multiple edges and hyperedges (whereas the VR-operations are not). The
original definitions (see Courcelle [8]) were given for graphs with multiple
edges and hyperedges. In Section 7 we prove that for a set of simple graphs,
HR-recognizability is the same in the HR-algebra of simple graphs and in the
larger HR-algebra of graphs with multiple edges. Without being extremely
difficult, the proof is not just a routine verification.

In Section 8 we consider an algebra arising from the theory of modular
decomposition of graphs. We show that under a natural finiteness condition,
the corresponding class of recognizable sets is equal to that of VR-recognizable
ones.

In an appendix, we clarify the definitions of certain equivalences of logical 
formulas, focusing on cases where they are decidable, and we give upper bounds 
to the cardinalities of the quotient sets for these equivalences. These results 
yield upper bounds to the number of equivalence classes in logically based con- 
gruences. They are thus useful for the investigation of recognizability in view 
of the cases where the sets under consideration are defined by logical formulas. 
They also provide elements to appreciate (an upper bound of) the complexity 
of the algorithms underlying a number of the effective proofs in the main body 
of the paper. 

This work has been presented in invited lectures by B. Courcelle ^2] and P. 
Weil [27j. 

2 Recognizability 

The notion of a recognizable set is due to Mezei and Wright [37]. It was
originally defined for one-sort structures, and we adapt it to many-sorted ones with
infinitely many sorts. We begin with definitions concerning many-sorted alge- 
bras. 

2.1 Algebras 

We follow essentially the notation and definitions from see also Let 
$\mathcal{S}$ be a set called the set of sorts. An $\mathcal{S}$-signature is a
set $\mathcal{F}$ given with two mappings
$\alpha\colon \mathcal{F} \to \mathrm{seq}(\mathcal{S})$ (the set of finite
sequences of elements of $\mathcal{S}$), called the arity mapping, and
$\sigma\colon \mathcal{F} \to \mathcal{S}$, called the sort mapping. We denote
by $\rho(f)$ the length of the sequence $\alpha(f)$, which we also call the
arity. The type of $f$ in $\mathcal{F}$ is the pair $(\alpha(f), \sigma(f))$,
which we shall rather write $\alpha(f) \to \sigma(f)$, or
$(s_1, s_2, \ldots, s_n) \to s$ if $\alpha(f) = (s_1, \ldots, s_n)$ and
$\sigma(f) = s$. The sequence $\alpha(f)$ may be empty (that is, $n = 0$), in
which case $f$ is called a constant of type $\sigma(f) = s$.

An $\mathcal{F}$-algebra is an object
$M = ((M_s)_{s \in \mathcal{S}}, (f_M)_{f \in \mathcal{F}})$, where for each
$s \in \mathcal{S}$, $M_s$ is a non-empty set, called the domain of sort $s$ of
$M$. For a nonempty sequence of sorts $\mu = (s_1, \ldots, s_n)$, we denote by
$M_\mu$ the product $M_{s_1} \times M_{s_2} \times \cdots \times M_{s_n}$. If
$\rho(f) > 0$, then $f_M$ is a total mapping from $M_{\alpha(f)}$ to
$M_{\sigma(f)}$. If $f$ is a constant of type $s$, then $f_M$ is an element of
$M_s$. The objects $f_M$ are called the operations of $M$. We assume that
$M_s \cap M_{s'} = \emptyset$ for $s \neq s'$. We also let $M$ denote the union
of the $M_s$ ($s \in \mathcal{S}$). For $d \in M$, we let $\sigma(d)$ denote the
unique $s \in \mathcal{S}$ such that $d \in M_s$.

A mapping $h\colon M \to M'$ between $\mathcal{F}$-algebras is a homomorphism
(or $\mathcal{F}$-homomorphism if it is useful to specify the signature) if it
maps $M_s$ into $M'_s$ for each sort $s$ and it commutes with the operations of
$\mathcal{F}$.

We denote by $T(\mathcal{F})$ the set of finite well-formed terms built with
$\mathcal{F}$ (we will call them $\mathcal{F}$-terms), and by
$T(\mathcal{F})_s$ the set of those terms of sort $s$ (the sort of a term is
that of its leading symbol). If $\mathcal{F}$ has no constant, the set
$T(\mathcal{F})$ is empty.

There is a standard structure of $\mathcal{F}$-algebra on $T(\mathcal{F})$. Its
domain of sort $s$ is $T(\mathcal{F})_s$, and $T(\mathcal{F})$ can be
characterized as the initial $\mathcal{F}$-algebra. This means that for every
$\mathcal{F}$-algebra $M$, there is a unique homomorphism
$\mathrm{val}_M\colon T(\mathcal{F}) \to M$. If $t \in T(\mathcal{F})_s$, the
image of $t$ under $\mathrm{val}_M$ is an element of $M_s$, also denoted by
$t_M$. It is nothing but the evaluation of $t$ in $M$, where the function
symbols are interpreted by the corresponding functions of $M$. One can consider
$t$ as a term denoting $t_M$, and $t_M$ as the value of $t$ in $M$. The set of
values in $M$ of the terms in $T(\mathcal{F})$ is called the subset generated by
$\mathcal{F}$. We say that a subset of $M$ is finitely generated if it is the
set of values of terms in $T(\mathcal{F}')$ for some finite subset
$\mathcal{F}'$ of $\mathcal{F}$.
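
The evaluation of terms in an algebra can be illustrated by a minimal sketch. The encoding of terms as nested tuples, the two-sorted toy signature and the names `SIGNATURE`, `OPERATIONS`, `sort_of` and `evaluate` are assumptions made for this illustration only; they do not come from the paper.

```python
# Hypothetical toy signature with sorts "nat" and "bool".
# Each symbol is mapped to (arity sequence alpha(f), sort sigma(f)).
SIGNATURE = {
    "zero": ((), "nat"),
    "succ": (("nat",), "nat"),
    "plus": (("nat", "nat"), "nat"),
    "is_zero": (("nat",), "bool"),
}

# One possible algebra M over this signature: the operations f_M.
OPERATIONS = {
    "zero": lambda: 0,
    "succ": lambda n: n + 1,
    "plus": lambda m, n: m + n,
    "is_zero": lambda n: n == 0,
}


def sort_of(term):
    """Sort of a term (that of its leading symbol); rejects ill-typed terms."""
    symbol, *args = term
    arity, sort = SIGNATURE[symbol]
    if tuple(sort_of(a) for a in args) != arity:
        raise TypeError(f"ill-typed arguments for {symbol}")
    return sort


def evaluate(term):
    """Value t_M of a term t, computed bottom-up (the mapping val_M)."""
    symbol, *args = term
    return OPERATIONS[symbol](*(evaluate(a) for a in args))


two = ("succ", ("succ", ("zero",)))
term = ("is_zero", ("plus", two, ("zero",)))
print(sort_of(term), evaluate(term))
```

Here `evaluate` plays the role of the unique homomorphism from the term algebra into the chosen algebra: it is entirely determined by the interpretation of the function symbols.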

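Let $\mathcal{F}$ be an $\mathcal{S}$-signature, $\mathcal{F}'$ be an
$\mathcal{S}'$-signature where $\mathcal{S}' \subseteq \mathcal{S}$. We say that
$\mathcal{F}'$ is a subsignature of $\mathcal{F}$, written
$\mathcal{F}' \subseteq \mathcal{F}$, if $\mathcal{F}'$ is a subset of
$\mathcal{F}$ and the type of every $f$ in $\mathcal{F}'$ is the same with
respect to $\mathcal{F}$ and to $\mathcal{F}'$. We say then that an
$\mathcal{F}'$-algebra $M'$ is a subalgebra of an $\mathcal{F}$-algebra $M$ if
$M'_s \subseteq M_s$ for every $s \in \mathcal{S}'$, and every operation of $M'$
coincides with the restriction to the domains of $M'$ of the corresponding
operation of $M$.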

We will often encounter the case where an $\mathcal{F}$-algebra $M$ is also the
carrier of a $\mathcal{G}$-algebra, and the $\mathcal{G}$-operations of $M$ can
be expressed as $\mathcal{F}$-terms: in that case, we say that the
$\mathcal{G}$-operations of $M$ are $\mathcal{F}$-derived, and the
$\mathcal{G}$-algebra $M$ is an $\mathcal{F}$-derived algebra (or it is derived
from $M$).

More formally, an $\mathcal{S}$-sorted set of variables is a pair $(X, \sigma)$
consisting of a set $X$ and a sort mapping $\sigma\colon X \to \mathcal{S}$
(usually denoted simply by $X$). We let $T(\mathcal{F}, X)$ be the set of
$(\mathcal{F} \cup X)$-terms written with $\mathcal{F} \cup X$, where it is
understood that the variables are among the nullary symbols (constants) of
$\mathcal{F} \cup X$. $T(\mathcal{F}, X)_s$ denotes the subset of those terms of
sort $s$. Now if $Y$ is a finite sequence of pairwise distinct variables from
$X$ and $t \in T(\mathcal{F}, X)_s$, we denote by $t_{M,Y}$ the mapping from
$M_{\sigma(Y)}$ to $M_s$ associated with $t$ in the obvious way ($\sigma(Y)$
denotes the sequence of sorts of the elements of $Y$). We call $t_{M,Y}$ a
derived operation of the algebra $M$. If $Y$ is known from the context, we write
$t_M$ instead of $t_{M,Y}$. This is the case in particular if $t$ is defined as
a member of $T(\mathcal{F}, \{x_1, \ldots, x_k\})$: the sequence $Y$ is
implicitly $(x_1, \ldots, x_k)$.

2.2 Recognizable subsets 

Let $\mathcal{F}$ be an $\mathcal{S}$-signature. An $\mathcal{F}$-algebra $M$ is
locally finite if each domain $M_s$ is finite. If $M$ is an
$\mathcal{F}$-algebra and $s \in \mathcal{S}$ is a sort, a subset $L$ of $M_s$
is $M$-recognizable if there exists a locally finite $\mathcal{F}$-algebra $A$,
a homomorphism $h\colon M \to A$, and a (finite) subset $C$ of $A_s$ such that
$L = h^{-1}(C)$.

We denote by $\mathrm{Rec}(M)_s$ the family of $M$-recognizable subsets of
$M_s$. In some cases it will be useful to stress the relevant signature and we
will talk of $\mathcal{F}$-recognizable sets instead of $M$-recognizable sets.
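
As a concrete instance of this definition, the following sketch treats the one-sorted algebra of words over {a, b} with concatenation. The finite algebra (the integers modulo 2 with addition), the homomorphism `h` and the set `C` are choices made for this illustration, not taken from the paper.

```python
from itertools import product


def h(word):
    """Homomorphism from ({a,b}*, concatenation) to (Z/2Z, +):
    the parity of the number of occurrences of 'a'."""
    return word.count("a") % 2


def add_mod2(x, y):
    """The single binary operation of the finite algebra."""
    return (x + y) % 2


C = {0}  # chosen subset of the finite algebra


def in_L(word):
    """Membership in L = h^{-1}(C): words with an even number of a's."""
    return h(word) in C


# Sanity check of the homomorphism property on all words of length <= 3.
words = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]
assert all(h(u + v) == add_mod2(h(u), h(v)) for u in words for v in words)

print(in_L("abab"), in_L("ab"))
```

The finite check above is only a sanity test on short words; the homomorphism property itself holds because letter counts add up under concatenation.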

An equivalent definition can be given in terms of finite congruences. A
congruence on $M$ is an equivalence relation $\approx$ on
$M = \bigcup_{s \in \mathcal{S}} M_s$ such that each set $M_s$ is a union of
equivalence classes, and which is stable under the operations of $M$. It is
locally finite if for each sort $s$, the restriction $\approx_s$ of $\approx$ to
$M_s$ has finite index. A congruence saturates a set if this set is a union of
classes. A subset $L$ of $M_s$ is $M$-recognizable if and only if it is
saturated by a locally finite congruence on $M$.

The following facts are easily verified from the definition of recognizability
or its characterization in terms of congruences (see [12]), and will be used
freely in the sequel.

Proposition 2.1 Let $M$ be an $\mathcal{F}$-algebra.

• For each sort $s$, the family $\mathrm{Rec}(M)_s$ contains $M_s$ and the empty
set, and it is closed under union, intersection and difference.

• If $h$ is a unary derived operation of $M$ or a homomorphism of $M'$ into $M$
(where $M'$ is another $\mathcal{F}$-algebra), then the inverse image under $h$
of an $M$-recognizable set is recognizable.

• If $N$ is a $\mathcal{G}$-algebra with the same domain as $M$, and if every
$\mathcal{F}$-congruence of $M$ is a $\mathcal{G}$-congruence of $N$ (e.g. $N$
is derived from $M$, or $\mathcal{G}$ is obtained from $\mathcal{F}$ by adding
constants), then every $M$-recognizable set is $N$-recognizable. If in addition
$\mathcal{G}$ contains $\mathcal{F}$, then $M$ and $N$ have the same
recognizable subsets.

• If $M'$ is a subalgebra of $M$ and $L$ is an $M$-recognizable set, then
$L \cap M'$ is $M'$-recognizable. This includes the case where $M'$ has the same
domain as $M$, and is an $\mathcal{F}'$-algebra for some subsignature
$\mathcal{F}'$ of $\mathcal{F}$.

• Suppose that $M$ is generated by $\mathcal{F}$ and let $\mathrm{val}_M$ be the
evaluation homomorphism from $T(\mathcal{F})$ onto $M$. A subset $L$ of $M_s$ is
$\mathcal{F}$-recognizable if and only if $\mathrm{val}_M^{-1}(L)$ is a
recognizable subset of $T(\mathcal{F})$. If in addition $\mathcal{F}$ is finite,
then this is equivalent to the existence of a finite tree-automaton recognizing
$\mathrm{val}_M^{-1}(L)$.

Example 2.2 On the set of all words over a finite alphabet $A$, let us consider
the binary operation of the concatenation product, and the unary operation
$u \mapsto u^2$, which is derived from the concatenation product. Then the third
statement in Proposition 2.1 shows that we have the same recognizable subsets as
if we considered only the concatenation product. It is interesting to note that,
in contrast, adding the operation $u \mapsto u^2$ to the signature adds new
equational languages, e.g. the set of all squares. □

We will see more technical conditions that guarantee the transfer of
recognizability between algebras in Section 2.4 below.

2.3 Remarks on the notion of recognizability

We gather here some observations on the significance of recognizability. 

First, we note that if $f$ is an operation of an $\mathcal{F}$-algebra $M$, with
arity $k$, and if $B_1, \ldots, B_k$ are $M$-recognizable, then
$f(B_1, \ldots, B_k)$ is not necessarily recognizable. This is discussed for
instance in [10], where sufficient conditions are given to ensure that
$f(B_1, \ldots, B_k)$ is recognizable. It is well-known for instance that the
product of two recognizable subsets of the free monoid (word languages) or of
the trace monoid is recognizable; a similar result holds for recognizable sets
of trees.

Now, let $M$ be an $\mathcal{F}$-algebra and let $\mathcal{F}'$ be a signature
which differs from $\mathcal{F}$ only by the choice of constants and their
values. In particular, $\mathcal{F}'$ may be obtained from $\mathcal{F}$ by the
addition of countably many new constants. Then the congruences on $M$ are the
same with respect to $\mathcal{F}$ and to $\mathcal{F}'$ and it follows that a
subset of $M$ is $\mathcal{F}$-recognizable if and only if it is
$\mathcal{F}'$-recognizable.

It is customary to assume that the $\mathcal{F}$-algebra $M$ is generated by the
signature $\mathcal{F}$. If $M$ is a countable $\mathcal{F}$-algebra that is not
generated by $\mathcal{F}$, we can enrich $\mathcal{F}$ to $\mathcal{F}'$ by
adding to $\mathcal{F}$ one constant of the appropriate sort for each element of
$M$. Then $\mathcal{F}'$ generates $M$ (in a trivial way). As noted above, $M$
has the same $\mathcal{F}$- and $\mathcal{F}'$-recognizable subsets. If $L$ is
one of these subsets, the set $\mathrm{val}_M^{-1}(L)$ of $\mathcal{F}'$-terms
is recognizable but we cannot do much with it, because we lack the notion of a
finite tree-automaton. See the conclusion of the paper for a further discussion
of this point.

Finally, we can question the interest of the notion of a recognizable set. Is 
it interesting in every algebra? The answer is clearly no. Let us explain why. 

If the algebraic structure over the considered set M is poor, for example 
in the absence of non-nullary functions, then every set L is recognizable, by 
a congruence with two classes, namely L and its complement. The notion of 
recognizability becomes void. 

Another extreme case is when the algebraic structure is so rich that there are
very few recognizable sets. For an example, consider the set $\mathbb{N}$ of
natural numbers equipped with the successor and the predecessor functions
(predecessor is defined by $\mathrm{pred}(0) = 0$, $\mathrm{pred}(n + 1) = n$).
The only recognizable sets are $\mathbb{N}$ and the empty set. Indeed, if $\sim$
is a congruence and if $n \sim n + p$ for some $n \geq 0$, $p > 0$, then by
using the function pred $n + p - 1$ times, we find that $0 \sim 1$. It follows
(using the successor function repeatedly) that any two integers are equivalent.
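
The computation behind this argument can be unfolded as follows. It only spells out the step stated above, writing $\mathrm{pred}^k$ and $\mathrm{succ}^k$ for $k$-fold application (a notation introduced here) and using $n \leq n + p - 1$, which holds because $p \geq 1$.

```latex
\begin{align*}
\mathrm{pred}^{\,n+p-1}(n)   &= \max\bigl(0,\; n - (n+p-1)\bigr) = 0, \\
\mathrm{pred}^{\,n+p-1}(n+p) &= (n+p) - (n+p-1) = 1, \\
\mathrm{succ}^{\,k}(0) = k \;&\sim\; k+1 = \mathrm{succ}^{\,k}(1)
  \quad \text{for every } k \geq 0 .
\end{align*}
```

Since the congruence is stable under pred, the first two lines turn $n \sim n + p$ into $0 \sim 1$; the third line and transitivity then identify all natural numbers.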

Intuitively, if one enriches an algebraic structure by adding new operations, 
one gets fewer recognizable sets. 

For another example, let us consider the monoid $\{a, b\}^*$ of words over two
letters. Let us add a unary operation, the circular shift, defined by
$\mathrm{sh}(1) = 1$ and $\mathrm{sh}(au) = ua$, $\mathrm{sh}(bu) = ub$, for
every word $u$ (here $1$ denotes the empty word). The language $a^*b$ is no
longer recognizable w.r.t. this new structure; however, recognizability does not
degenerate completely, since every commutative language that is recognizable in
the usual sense remains recognizable in the enriched algebraic structure.
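
One possible way to see the claim about the language a*b is sketched below; the argument and the helper names (`sh_iter`, `separating_pair`) are ours, not the paper's. A congruence of finite index must identify two powers a^i and a^j with i < j, hence also the words a^i·b·a^i and a^j·b·a^i; applying the shift i + 1 times sends the first word into a*b and the second one outside of it, so the congruence cannot saturate a*b.

```python
import re


def sh(word):
    """Circular shift: move the first letter to the end (the empty word is fixed)."""
    return word[1:] + word[:1]


def sh_iter(word, k):
    """Apply the circular shift k times."""
    for _ in range(k):
        word = sh(word)
    return word


def in_a_star_b(word):
    """Membership in the language a*b."""
    return re.fullmatch(r"a*b", word) is not None


def separating_pair(i, j):
    """For i < j, images of a^i.b.a^i and a^j.b.a^i under i + 1 shifts."""
    suffix = "b" + "a" * i
    return sh_iter("a" * i + suffix, i + 1), sh_iter("a" * j + suffix, i + 1)


u, v = separating_pair(2, 5)
print(u, in_a_star_b(u))
print(v, in_a_star_b(v))
```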

It is not completely clear yet which algebraic condition makes recognizability
"interesting".

2.4 Technical results on recognizability 

The statements in this section explain how to transfer a locally finite congruence
from one algebra to another, possibly with a different signature, and hence how
to transfer recognizability properties between algebras. Proposition 2.1 above
contains examples of such results.

The statements that follow will be used in the proof of some of our main re- 
sults, in Section^ They are, unfortunately, heavily technical in their statements 
(but not in their proofs. . . ) 

Lemma 2.3 Let $\mathcal{F}$ be an $\mathcal{S}$-signature and let $\mathcal{G}$
be a $\mathcal{T}$-signature. Let $S$ be an $\mathcal{F}$-algebra and let $T$ be
a $\mathcal{G}$-algebra. Let also $\mathcal{H}$ be a collection
$(\mathcal{H}_{t,s})$ such that, for each $t \in \mathcal{T}$ and
$s \in \mathcal{S}$, $\mathcal{H}_{t,s}$ consists of mappings from $T_t$ into
$S_s$ with the following property:

for each operation $g \in \mathcal{G}$ of type $(t_1, \ldots, t_r) \to t$ and
for each $h \in \mathcal{H}_{t,s}$, there exist sorts
$s_1, \ldots, s_r \in \mathcal{S}$, mappings $h_i \in \mathcal{H}_{t_i,s_i}$
($1 \leq i \leq r$) and an $\mathcal{F}$-derived operation $f$ of type
$(s_1, \ldots, s_r) \to s$ such that, for every
$x_1 \in T_{t_1}, \ldots, x_r \in T_{t_r}$,
$h(g(x_1, \ldots, x_r)) = f(h_1(x_1), \ldots, h_r(x_r))$.

Finally, let $\equiv$ be an $\mathcal{F}$-congruence on $S$ and let $\approx$ be
the equivalence relation defined, on each $T_t$, by

$x \approx y$ if and only if $h(x) \equiv h(y)$ for every
$h \in \mathcal{H}_{t,s}$, $s \in \mathcal{S}$.

Then $\approx$ is a $\mathcal{G}$-congruence on $T$.

Proof. Let $g$ be an operation in $\mathcal{G}$, of type
$(t_1, \ldots, t_r) \to t$, and let
$x_1, y_1 \in T_{t_1}, \ldots, x_r, y_r \in T_{t_r}$ such that
$x_i \approx y_i$ for each $i = 1, \ldots, r$. Let also
$h \in \mathcal{H}_{t,s}$ with $s \in \mathcal{S}$.

By hypothesis, there exist sorts $s_1, \ldots, s_r \in \mathcal{S}$, mappings
$h_i \in \mathcal{H}_{t_i,s_i}$ (for $i = 1, \ldots, r$) and an
$\mathcal{F}$-derived operation $f$ of type $(s_1, \ldots, s_r) \to s$ such that

$h(g(x_1, \ldots, x_r)) = f(h_1(x_1), \ldots, h_r(x_r))$,
$h(g(y_1, \ldots, y_r)) = f(h_1(y_1), \ldots, h_r(y_r))$.

Since $x_i \approx y_i$ for each $i$, we have $h_i(x_i) \equiv h_i(y_i)$; and
since $\equiv$ is an $\mathcal{F}$-congruence, it follows that
$h(g(x_1, \ldots, x_r)) \equiv h(g(y_1, \ldots, y_r))$. Thus we have
$g(x_1, \ldots, x_r) \approx g(y_1, \ldots, y_r)$, which concludes the proof. □

With the notation of Lemma 2.3, for each sort $t \in \mathcal{T}$, let $\leq_t$
be the quasi-order relation defined on
$\mathcal{H}_t = \bigcup_{s \in \mathcal{S}} \mathcal{H}_{t,s}$ by

$h \leq_t h'$ if there exists an $\mathcal{F}$-derived unary operation $f$ such
that $h' = f \circ h$.

Lemma 2.4 With the notation of Lemma 2.3, if for each $t$ the order relation
associated with $\leq_t$ has a finite number of minimal elements, and if the
$\mathcal{F}$-congruence $\equiv$ on $S$ is locally finite, then the
$\mathcal{G}$-congruence $\approx$ on $T$ is locally finite.

Proof. Let $t \in \mathcal{T}$. We want to show that there are only finitely
many $\approx$-classes in $T_t$. By assumption, there exist elements
$h_1, \ldots, h_k \in \mathcal{H}_t$ such that every mapping of $\mathcal{H}_t$
is of the form $f \circ h_i$ for some $1 \leq i \leq k$ and some
$\mathcal{F}$-derived operation $f$.

For each $i$, let $S_{s_i}$ be the range of $h_i$ and let $n_i$ be the number of
$\equiv$-classes in $S_{s_i}$. It is immediately verified from the definition of
$\leq_t$ that if $x, y \in T_t$, then $x \approx y$ if and only if
$h_i(x) \equiv h_i(y)$ for each $1 \leq i \leq k$. In particular, $T_t$ has at
most $n_1 \cdots n_k$ $\approx$-classes, which concludes the proof. □

We will actually need even more technical versions of these lemmas. 

Lemma 2.5 Let $S$, $T$, $\mathcal{F}$, $\mathcal{G}$ and $\mathcal{H}$ be as in
Lemma 2.3 and let $\zeta$ be a $\mathcal{G}$-congruence on $T$ such that:

for each operation $g \in \mathcal{G}$ of type $(t_1, \ldots, t_r) \to t$, for
each $h \in \mathcal{H}_{t,s}$ and for each $z = (z_1, \ldots, z_r)$ where each
$z_i$ is a $\zeta$-class of $T_{t_i}$, there exist sorts
$s_{1,z}, \ldots, s_{r,z} \in \mathcal{S}$, mappings
$h_{i,z} \in \mathcal{H}_{t_i,s_{i,z}}$ ($1 \leq i \leq r$) and an
$\mathcal{F}$-derived operation $f_z$ of type
$(s_{1,z}, \ldots, s_{r,z}) \to s$ such that, in $T$,
$h(g(x_1, \ldots, x_r)) = f_z(h_{1,z}(x_1), \ldots, h_{r,z}(x_r))$ if each $x_i$
is in $z_i$.

Finally, let $\equiv$ be an $\mathcal{F}$-congruence on $S$ and let $\approx$ be
the equivalence relation defined, on each $T_t$, by

$x \approx y$ if and only if $x \mathrel{\zeta} y$ and $h(x) \equiv h(y)$ for
every $h \in \mathcal{H}_{t,s}$, $s \in \mathcal{S}$.

Then $\approx$ is a $\mathcal{G}$-congruence on $T$. Moreover, if $\mathcal{H}$
satisfies the hypothesis of Lemma 2.4, and $\equiv$ and $\zeta$ are locally
finite, then $\approx$ is locally finite as well.

Proof. The proof is the same as for Lemmas 2.3 and 2.4. □



3 Algebras of relational structures 

Even though we are ultimately interested in studying sets of graphs, it will be
convenient to handle the more general case of relational structures.
Furthermore, relational structures can be identified with simple directed
hypergraphs. Such hypergraphs form a natural representation of terms. See for
instance the chapter on hypergraphs in [15] for applications.

In this paper, all graphs and structures are finite or countable. Our proofs 
will not usually depend on cardinality assumptions on the graphs or struc- 
tures, and hence our results will hold for finite as well as for infinite graphs or 
structures. However, recognizability in the algebraic sense we defined, is really 
interesting only for dealing with finitely generated objects, and hence for finite 
graphs and structures. For dealing with infinite words, trees and graphs, other 
tools are necessary, see for instance 0U1 IH9 EH1 00] ■ 

3.1 Relational structures



Let $R$ be a finite set of relation symbols, and $C$ be a finite set of nullary
symbols. Each symbol $r \in R$ has an associated positive integer called its
rank, denoted by $\rho(r)$. An $(R, C)$-structure is a tuple
$S = (D_S, (r_S)_{r \in R}, (c_S)_{c \in C})$ such that $D_S$ is a (possibly
empty) set called the domain of $S$, each $r_S$ is a $\rho(r)$-ary relation on
$D_S$, i.e., a subset of $D_S^{\rho(r)}$, and each $c_S$ is an element of $D_S$,
called the $c$-source of $S$.

We denote by $\mathrm{StS}(R, C)$ the class of (finite or countable)
$(R, C)$-structures, and we sometimes write $\mathrm{StS}(R)$ for
$\mathrm{StS}(R, \emptyset)$. By convention, isomorphic structures will be
considered as equal. In the notation StS, St stands for structures, while the
second S stands for sources.

A structure $S \in \mathrm{StS}(R, C)$ is source-separated if $c_S \neq c'_S$
for $c \neq c'$. We will denote by $\mathrm{StS}_{\mathrm{sep}}(R, C)$ the class
of source-separated structures in $\mathrm{StS}(R, C)$. See Corollary 3.11 and
Section 5.2 below.
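
For finite structures, these definitions translate directly into a small data structure. The sketch below is one possible encoding; the class name `Structure`, its field names and the two example structures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    """A finite (R, C)-structure."""
    domain: frozenset   # the domain D_S
    relations: dict     # relation symbol r -> set of tuples (the relation r_S)
    sources: dict       # constant symbol c -> element c_S of the domain

    def is_well_formed(self, ranks):
        """Check that every tuple has the rank of its symbol and stays inside
        the domain, and that every source is an element of the domain."""
        tuples_ok = all(
            len(t) == ranks[r] and set(t) <= self.domain
            for r, tuples in self.relations.items()
            for t in tuples
        )
        sources_ok = all(v in self.domain for v in self.sources.values())
        return tuples_ok and sources_ok

    def is_source_separated(self):
        """True when distinct constant symbols denote distinct elements."""
        values = list(self.sources.values())
        return len(values) == len(set(values))


# A directed path 1 -> 2 -> 3 with two sources c and d.
path = Structure(frozenset({1, 2, 3}), {"edge": {(1, 2), (2, 3)}}, {"c": 1, "d": 3})
# The same graph, but with both sources on vertex 1.
fused = Structure(frozenset({1, 2, 3}), {"edge": {(1, 2), (2, 3)}}, {"c": 1, "d": 1})
print(path.is_source_separated(), fused.is_source_separated())
```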

In order to handle graphs, we will consider particular kinds of structures in
the sequel. We let $E = \{\mathit{edge}\}$ be the set of relation symbols
consisting of a single binary relation $\mathit{edge}$, intended to represent
directed edges. Thus graphs can be seen as the elements of $\mathrm{StS}(E)$,
also written Graph. Clearly these graphs are directed, simple (we cannot
represent multiple edges) and they may have loops. For a discussion of graphs
with multiple edges, see Section 7.

We let $\mathcal{GS}(C)$ denote the set $\mathrm{StS}(E, C)$. These structures
are called graphs with sources. We let $\mathcal{GS}_{\mathrm{sep}}(C)$ denote
the intersection $\mathcal{GS}(C) \cap \mathrm{StS}_{\mathrm{sep}}(R, C)$.

We will discuss also graphs with ports (Section 4): if $P$ is a finite set of
unary relation symbols called port labels, then we denote by $E_P$ the set of
relational symbols $E \cup P$ and by $\mathcal{GP}(P)$ the class
$\mathrm{StS}(E_P)$. Port labels are useful for
studying the clique- width of graphs, see |18M19| and Remark |4 . 1 II below . 

3.2 The algebra StS 

We first define some operations on structures.

Disjoint union Let $C$ and $C'$ be disjoint sets of constants and let
$S \in \mathrm{StS}(R, C)$ and $S' \in \mathrm{StS}(R', C')$. Let us also assume
that $S$ and $S'$ have disjoint domains. We denote by $S \oplus S'$ the union of
$S$ and $S'$, which is naturally a structure in
$\mathrm{StS}(R \cup R', C \cup C')$.

If $S$ and $S'$ are not disjoint, we replace $S'$ by a disjoint copy. We need
not be very precise on how to choose this copy because different choices will
yield isomorphic $\oplus$-sums, and we are interested in structures up to
isomorphism.
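
A minimal sketch of the disjoint union on finite structures follows. The dictionary encoding (keys `domain`, `relations`, `sources`) and the tagging of elements by 0 or 1 to obtain disjoint copies are assumptions of this illustration; any other choice of copies gives an isomorphic result.

```python
def disjoint_union(s1, s2):
    """Disjoint union of two finite structures given as dicts with keys
    'domain', 'relations' and 'sources'. The constant sets must be disjoint."""
    if set(s1["sources"]) & set(s2["sources"]):
        raise ValueError("the two sets of constants must be disjoint")

    def tagged(s, tag):
        # Replace every element x by (tag, x), which forces disjoint domains.
        def rename(x):
            return (tag, x)

        domain = {rename(x) for x in s["domain"]}
        relations = {
            r: {tuple(rename(x) for x in t) for t in tuples}
            for r, tuples in s["relations"].items()
        }
        sources = {c: rename(x) for c, x in s["sources"].items()}
        return domain, relations, sources

    d1, r1, c1 = tagged(s1, 0)
    d2, r2, c2 = tagged(s2, 1)
    # The relation symbols of the result are those of R union R'.
    relations = {r: r1.get(r, set()) | r2.get(r, set()) for r in set(r1) | set(r2)}
    return {"domain": d1 | d2, "relations": relations, "sources": {**c1, **c2}}


edge = {"domain": {1, 2}, "relations": {"edge": {(1, 2)}}, "sources": {"c": 1}}
point = {"domain": {1}, "relations": {"edge": set()}, "sources": {"d": 1}}
union = disjoint_union(edge, point)
print(len(union["domain"]), union["sources"])
```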

Remark 3.1 It is also possible to define a similar operation, without the
restriction that C and C' are disjoint (as in, say, [?]). See Section 3.5.1 below
for a discussion. □

Quantifier-free definable operations Our purpose is now to define functions
from StS(R, C) to StS(R', C') by quantifier-free formulas. We denote by
QF(R, C, {x_1, ..., x_n}) the set of quantifier-free formulas on (R, C)-structures
with variables in {x_1, ..., x_n}.

A qfd operation scheme from StS(R, C) to StS(R', C') is a tuple

$$(\delta, (\varphi_r)_{r \in R'}, (\kappa_{c,d})_{c \in C,\, d \in C'}),$$

where δ ∈ QF(R, C, {x}), φ_r ∈ QF(R, C, {x_1, ..., x_{ρ(r)}}) if r is a ρ(r)-ary
relation symbol, κ_{c,d} ∈ QF(R, C, ∅), such that the following formulas are valid
in every structure in StS(R, C), for all c, d ∈ C, d' ∈ C' and r ∈ R' of arity
ρ(r):

• $\kappa_{c,d'} \wedge \kappa_{d,d'} \Rightarrow c = d$;

• $\bigvee_{c \in C} \kappa_{c,d'}$;

• $\kappa_{c,d'} \Rightarrow \delta(c)$;

• $\forall x_1, \ldots, x_{\rho(r)} \bigl(\varphi_r(x_1, \ldots, x_{\rho(r)}) \Rightarrow \bigwedge_{i=1}^{\rho(r)} \delta(x_i)\bigr)$.

The reason for these conditions becomes apparent with the following definition
of the qfd operation g: StS(R, C) → StS(R', C') defined by such a scheme. Let
S ∈ StS(R, C). The domain of the structure g(S) is the subset of the domain
of S defined by formula δ and the relation r (r ∈ R') on g(S) is described by
formula φ_r. Finally, if d ∈ C', then d_{g(S)} = c_S if c ∈ C and S satisfies
κ_{c,d}. The first two conditions imposed above assert that relative to S, c is
uniquely defined for each d, the third condition asserts that d_{g(S)} always lies
in the domain of g(S), and the fourth condition asserts that the relation φ_r
(r ∈ R') can only relate elements of the domain of g(S).
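
The way a scheme acts on a structure can be sketched as follows. This is only an illustration under assumptions of our own: structures are encoded as triples (domain, relations, sources), and the formulas δ, φ_r and κ_{c,d} are represented by Python predicates rather than by syntactic formulas, so nothing here checks that they are quantifier-free. The names apply_qfd and edge_complement are hypothetical.

```python
from itertools import product

def apply_qfd(structure, delta, phi, kappa, out_labels):
    """Apply a scheme (delta, phi, kappa) to a structure (hypothetical encoding).

    delta(S, x)    -> bool : does x belong to the new domain?
    phi[r]         = (arity, predicate) for each output relation symbol r
    kappa(S, c, d) -> bool : is the d-source of g(S) the c-source of S?
    """
    domain, relations, sources = structure
    new_domain = {x for x in domain if delta(structure, x)}
    new_relations = {}
    for name, (arity, formula) in phi.items():
        # Tuples are taken in the new domain only, as the fourth condition requires.
        new_relations[name] = {
            tup for tup in product(new_domain, repeat=arity)
            if formula(structure, *tup)
        }
    new_sources = {}
    for d in out_labels:
        # First two conditions: the labels c with kappa true all name one element.
        witnesses = {sources[c] for c in sources if kappa(structure, c, d)}
        assert len(witnesses) == 1, "scheme conditions violated"
        new_sources[d] = witnesses.pop()
    return new_domain, new_relations, new_sources

def edge_complement(structure):
    """Example scheme: same domain and sources, edge(x, y) iff x != y and no edge."""
    return apply_qfd(
        structure,
        delta=lambda s, x: True,
        phi={"edge": (2, lambda s, x, y: x != y and (x, y) not in s[1]["edge"])},
        kappa=lambda s, c, d: c == d,
        out_labels=list(structure[2]),
    )
```

In this sketch the third condition (the new sources lie in the new domain) is left to the caller, as it is in the definition of a scheme.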

Remark 3.2 Note that in the first condition, c = d does not mean that c and 
d are the same constant, but that they have the same value in the considered 
structure. □ 



Remark 3.3 The conditions to be verified by a qfd operation scheme are
decidable. It follows that the notion of a qfd operation scheme is effective. See
the appendix (Remark A.4 in particular) for a discussion of this decidability
result. □



Example 3.4 Let R be a finite set of relational symbols, C be a finite set of 
source labels and let a, b be source labels. We define the following operations. 

• if a ∈ C and b ∉ C, srcren_{a→b} is the unary operation of type (R, C) →
(R, C \ {a} ∪ {b}) which renames the a-source of a structure to a b-source;

• if a ∈ C, srcfg_a is the unary operation of type (R, C) → (R, C \ {a}) which
forgets the a-source of a structure;

• if a ≠ b ∈ C, fus_{a,b} is the unary operation of type (R, C) → (R, C) which
identifies the a-source and the b-source of a structure (so the resulting
domain element is both the a-source and the b-source), and reorganizes
the tuples of the relational structure accordingly.

Note that the operation names srcren_{a→b}, srcfg_a and fus_{a,b} are overloaded:
they denote different operations when the sets R and C are allowed to vary. A
completely formal definition would use operation names such as srcren_{a→b,R,C},
which would be inconvenient.

It is immediately verified that the operations of the form srcren_{a→b} and
srcfg_a are qfd. It is probably worth showing explicitly a qfd operation scheme
defining the operation fus_{a,b}.

Let δ(x) be the formula (a = b) ∨ ((a ≠ b) ∧ (x ≠ a)). If r ∈ R has arity
ρ(r) = n, let φ_r(x_1, ..., x_n) be the formula

$$\bigl((a = b) \wedge r(x_1, \ldots, x_n)\bigr) \vee \Bigl((a \neq b) \wedge \bigvee_{I \subseteq \{1, \ldots, n\}} \Bigl(\bigwedge_{i \in I} (x_i = b) \wedge \bigwedge_{i \notin I} (x_i \neq b) \wedge r(y_1, \ldots, y_n)\Bigr)\Bigr),$$

where for each I, y_i = a if i ∈ I and y_i = x_i otherwise. For each d ∈ C such
that d ≠ a and for each c ∈ C, let κ_{c,d} be the formula c = d; let κ_{b,a} be the
formula true, and let κ_{c,a} be the formula c = a for each c ≠ b. It is now routine
to verify that the scheme (δ, (φ_r)_{r∈R}, (κ_{c,d})_{c,d∈C}) defines fus_{a,b}. □
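
The effect of fus_{a,b} on a concrete structure can be sketched directly, following the informal description of Example 3.4 rather than transcribing the formulas of the scheme. The encoding of a structure as a triple (domain, relations, sources) and the function name fuse_sources are assumptions of this sketch. As in the scheme, the a-source is the element removed from the domain and the b-source is the one that is kept.

```python
def fuse_sources(structure, a, b):
    """Identify the a-source with the b-source (hypothetical encoding).

    The element that was the a-source is merged into the b-source; every
    tuple and every source label pointing to it is redirected.
    """
    domain, relations, sources = structure
    keep, drop = sources[b], sources[a]

    def merge(x):
        return keep if x == drop else x

    new_domain = {merge(x) for x in domain}
    new_relations = {
        name: {tuple(merge(x) for x in tup) for tup in tuples}
        for name, tuples in relations.items()
    }
    new_sources = {c: merge(x) for c, x in sources.items()}
    return new_domain, new_relations, new_sources
```

If the two sources already coincide, the structure is returned unchanged, which corresponds to the case a = b in δ and φ_r.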

Remark 3.5 There is no qfd operation from StS(R) into StS(R', C') if C' ≠ ∅,
because in the absence of constants in the input structure, we cannot define
constants in the output structure. □

Example 3.6 The natural inclusion of StS(R, C) into StS(R', C) when R'
contains R is a qfd operation in a natural way: the formulas intended to define
relations in R' \ R are taken to be identically false. □

The signature S We define the algebra StS of structures with sources as 
follows. First, let us fix once and for all a countable set of relation symbols con- 
taining edge and countably many relation symbols of each arity, and a countable 
set of constants. In the sequel, finite sets of relation symbols R and finite sets 
of constants C will be taken in these fixed sets. The set of sorts consists of all 
such pairs (R, C). The set of elements of StS of sort (R, C) is StS(R, C). 

The signature S consists of the following operations (interpreted in StS).
First, for each pair of sorts (R, C) and (R', C') such that C ∩ C' = ∅, the
disjoint union ⊕ is an operation of type ((R, C), (R', C')) → (R ∪ R', C ∪ C').
Note that we overload the symbol ⊕, that is, we denote in the same way an
infinite number of operations on StS. Next, every qfd operation is a (unary)
operation in S.

Finally, we observe that the signature S contains the natural inclusions of
StS(R, C) into StS(R', C) when R' contains R, which are qfd (Example 3.6).

As for constants in S, one can pick a single source label a, and consider
a single constant a, denoting the structure with a single element, which is an
a-source, and no relations. Together with the operations in S, this constant
suffices to generate all finite relational structures. As noted in Section 2.3, the
choice of constants does not affect recognizability. It only affects the generating
power of the signature, but this is not our point in this paper.

3.3 Elementary properties of StS 

We first consider the composition of qfd operations. 

Proposition 3.7 Qfd operations in StS are closed under composition
(whenever types fit for defining meaningful composition).

Proof. Let g: StS(R, C) → StS(R', C') and g': StS(R', C') → StS(R'', C'')
be qfd operations, given respectively by the schemes
(δ, (φ_r)_{r∈R'}, (κ_{c,d})_{c∈C, d∈C'}) and (δ', (φ'_r)_{r∈R''}, (κ'_{c,d})_{c∈C', d∈C''}).

The composite g' ∘ g turns an (R, C)-structure into an (R'', C'')-structure.

Let δ°, φ°_r (r ∈ R'') and κ°_{c,d} (c ∈ C', d ∈ C'') be obtained from δ', φ'_r and
κ'_{c,d} by replacing every occurrence of r(y_1, ..., y_{ρ(r)}) (r ∈ R') by
φ_r(y_1, ..., y_{ρ(r)}); our formulas are now in the language of (R, C')-structures
and we need to "translate" the constants d ∈ C' into elements of C. However, this
translation, a mapping from C' to C, depends on the structure in which we operate.

To reflect this observation, for each mapping h: C' → C, we let h(δ°) be the
conjunction of the formulas κ_{h(d),d} (d ∈ C') and the formula obtained from δ°
by replacing each occurrence of d (d ∈ C') by h(d). Finally, we let δ'' be the
disjunction of the h(δ°) when h runs over all mappings from C' to C.

We proceed in the same fashion to define φ''_r and κ''_{c,d} for each r ∈ R''
and each c ∈ C', d ∈ C''. Finally, if b ∈ C and d ∈ C'', we let

$$\lambda_{b,d} = \bigvee_{c \in C'} (\kappa_{b,c} \wedge \kappa''_{c,d}).$$

It is a routine verification that (δ'', (φ''_r)_{r∈R''}, (λ_{b,d})_{b∈C, d∈C''}) is a qfd
operation scheme, which defines the composite operation g' ∘ g. This completes
the proof. □

For each S ∈ StS(R, C), we define the type of S, written ζ(S), to be the
restriction of S to its set of sources. That is: the domain of ζ(S) is the set of
C-sources of S, and the relations of ζ(S) are those tuples of C-sources that are
relations in S. In order to simplify notation, we also denote by ζ the equivalence
relation on StS given by

S ζ T if and only if ζ(S) and ζ(T) are isomorphic.

Lemma 3.8 Let S, T ∈ StS(R, C). Then S ζ T if and only if S and T satisfy
the same formulas in QF(R, C, ∅).

Proof. A formula in QF(R, C, ∅) is a Boolean combination of atoms of the form
c = d where c, d ∈ C, or r(x_1, ..., x_n) where r ∈ R has arity n and the x_i are
in C. It is immediate that such an atom is true in S if and only if it is true in
ζ(S). Thus S and ζ(S) satisfy the same formulas in QF(R, C, ∅): in particular,
ζ-equivalent structures satisfy the same formulas in QF(R, C, ∅). Thus, if we
denote by Th^{QF}_{R,C}(S) the set of formulas in QF(R, C, ∅) that are satisfied by S
(see Section 3.4), we find that Th^{QF}_{R,C}(S) = Th^{QF}_{R,C}(ζ(S)).

Conversely, we observe that if S is a structure in StS(R, C), which consists
only of its C-sources (that is, S = ζ(S)), then S is entirely described by some
formula in QF(R, C, ∅). Thus, if ζ(S) ≠ ζ(T), then Th^{QF}_{R,C}(S) ≠ Th^{QF}_{R,C}(T).
This suffices to conclude the proof. □

The type relation £ has the following important property. 

Proposition 3.9 The type relation ζ is a locally finite congruence on StS.

Proof. The verification that ζ(S ⊕ S') = ζ(S) ⊕ ζ(S') (S ∈ StS(R, C), S' ∈
StS(R', C') and C ∩ C' = ∅) is immediate. Let us now consider a qfd
operation g: StS(R, C) → StS(R', C'), specified by the qfd operation scheme
(δ, (φ_r)_{r∈R'}, (κ_{c,d})_{c∈C, d∈C'}). By Lemma 3.8, S and ζ(S) satisfy the same
formulas of QF(R, C, ∅). In particular, for each c ∈ C and d ∈ C', S and ζ(S)
both satisfy κ_{c,d}, or both satisfy its negation. Thus g(S) and g(ζ(S)) have the
same sources, and hence ζ(g(S)) = ζ(g(ζ(S))).

We have just shown that the type relation is a congruence. To complete
the proof, it suffices to show that for each sort (R, C), the set of types of sort
(R, C), that is, the set ζ(StS(R, C)) is finite. Note that if S ∈ StS(R, C), then
ζ(S) has cardinality at most card(C) (and also at most card(S)). It follows that

$$\mathrm{card}(\zeta(StS(R, C))) \le \mathrm{card}(C)! \cdot \prod_{r \in R} 2^{\mathrm{card}(C)^{\rho(r)}}.$$

□
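
By Lemma 3.8, the type of a structure is determined, up to isomorphism, by the atomic facts about its constants. The following sketch computes this finite description; the encoding of structures as triples (domain, relations, sources) and the names type_signature and is_source_separated are assumptions made for the illustration. It also shows that source separation can be read off the type alone.

```python
from itertools import product

def type_signature(structure, arities):
    """Atomic facts over the constants: a finite description of the type.

    `arities` maps each relation symbol to its arity. Two structures of the
    same sort have isomorphic types exactly when their signatures are equal.
    """
    _, relations, sources = structure
    labels = sorted(sources)
    equal = frozenset(
        (c, d) for c in labels for d in labels if sources[c] == sources[d]
    )
    facts = frozenset(
        (name, cs)
        for name, arity in arities.items()
        for cs in product(labels, repeat=arity)
        if tuple(sources[c] for c in cs) in relations.get(name, set())
    )
    return equal, facts

def is_source_separated(signature):
    """Source separation depends only on the equalities between constants."""
    equal, _ = signature
    return all(c == d for c, d in equal)
```

Since there are finitely many possible values of such a signature for a given sort, this also gives a concrete view of local finiteness.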

Remark 3.10 Proposition 3.9 can be seen as a particular case of a result of
Feferman and Vaught (Theorem 3.12 below), which will be used in Section
[?]. The simple formulation above will be very useful. □

Note that the knowledge of ζ(S) is sufficient to determine whether S is
a source-separated structure. This observation is used to prove the following
corollary.

Corollary 3.11 Let (R, C) be a sort in StS. Then StS_sep(R, C) is a
recognizable subset of StS(R, C).

Proof. Whether a structure S is source-separated depends only on its type
ζ(S): in particular, the type congruence ζ saturates StS_sep(R, C). By
Proposition 3.9 this relation is a locally finite congruence, and hence
StS_sep(R, C) is recognizable. □

3.4 A result of Feferman and Vaught

If (R, C) is a sort of StS, we denote by FO(R, C) the set of closed first-order
formulas over R and C. For each integer d, we denote by FO_d(R, C) the set
of those formulas of quantifier-depth at most d. Up to a decidable syntactic
equivalence (taking into account Boolean laws, properties of equality, renaming
of quantified variables, see Appendix A), there are only finitely many formulas
in each set FO_d(R, C). Thus, we can reason as if FO_d(R, C) was actually finite.

For an (R, C)-structure S, we let its FO_d-theory be the set Th^{FO_d}_{R,C}(S) of
formulas in FO_d(R, C) that are valid in S. It is finite since it is a subset of the
finite set FO_d(R, C).

Theorem 3.12 Let d ≥ 0.

(1) For every qfd operation f of type (R, C) → (R', C'), there exists a mapping
f^# such that, for every (R, C)-structure S,

$$\mathrm{Th}^{FO_d}_{R',C'}(f(S)) = f^{\#}\bigl(\mathrm{Th}^{FO_d}_{R,C}(S)\bigr).$$

(2) For every (R, C) and (R', C'), where C and C' are disjoint, there exists
a binary function ⊕^# such that, for every (R, C)-structure S, and every
(R', C')-structure S',

$$\mathrm{Th}^{FO_d}_{R \cup R', C \cup C'}(S \oplus S') = \mathrm{Th}^{FO_d}_{R,C}(S) \oplus^{\#} \mathrm{Th}^{FO_d}_{R',C'}(S').$$

Remark 3.13 The second assertion was proved in [?] for first-order logic, and
extended by Shelah to monadic second-order logic [?]. The importance of this
result is discussed by Makowsky in [20]. □

Remark 3.14 The functions f^# and ⊕^# have finite domains and codomains.
However these sets are quite large. These functions can be (at least in principle)
effectively determined for given (R, C), (R', C'), and d. □

3.5 Variants of the algebra of relational structures 

In the literature on recognizable and equational graph languages, several vari- 
ants of the signature S and the algebra StS are considered, notably a variant 
where the definition of the disjoint union is replaced by a more general parallel 
product, and a variant where all structures are assumed to be source-separated. 
We verify in this section that these variants do not yield different notions of 
recognizability. 

3.5.1 Parallel composition vs. disjoint union 

In the literature (e.g. [?]), the operation of disjoint union ⊕ is sometimes
replaced by the so-called parallel composition (or product), written ||, an
operation of type ((R, C), (R', C')) → (R ∪ R', C ∪ C') for which we do not assume
that C and C' are disjoint. If S ∈ StS(R, C) and S' ∈ StS(R', C'), the parallel
composition S || S' is obtained by taking the (set-theoretic) disjoint union of S
and S' and then identifying the c-sources of S and S' for each c ∈ C ∩ C'. Let
S_|| denote the signature obtained from S by substituting || for ⊕.

Proposition 3.15 Let L be a subset of StS. Then L is S-recognizable if and
only if it is S_||-recognizable.

Proof. We first observe that the operation ⊕ is a particular case of ||. Therefore
S is a sub-signature of S_|| and hence, every S_||-recognizable set is S-recognizable.

To prove the converse, it suffices to verify that || is an S-derived operation
by Proposition [?]. Indeed, if S ∈ StS(R, C) and S' ∈ StS(R', C'), the parallel
composition S || S' can be obtained by the following sequence of S-operations
(see Example 3.4 for their definition):

- for each c ∈ C ∩ C', apply the qfd operation srcren_{c→c̄} which renames the
c-source in S' with a new source label, say c̄, not in C; let S̄' be the resulting
structure;

- take the disjoint union S ⊕ S̄';

- for each c ∈ C ∩ C', apply the operation fus_{c,c̄} which identifies the c-source
and the c̄-source in S ⊕ S̄';

- apply the source-forgetting operation srcfg_{c̄} for each c ∈ C ∩ C'. □
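
The four steps of this proof can be followed on concrete structures. The sketch below is an illustration under assumptions of our own: structures are encoded as triples (domain, relations, sources), the helper names are hypothetical, and the fresh label c̄ is represented by the pair ("fresh", c), assumed not to occur among the given source labels.

```python
def srcren(s, a, b):
    """Rename the a-source to a b-source."""
    dom, rel, src = s
    return dom, rel, {(b if c == a else c): x for c, x in src.items()}

def srcfg(s, a):
    """Forget the a-source."""
    dom, rel, src = s
    return dom, rel, {c: x for c, x in src.items() if c != a}

def disjoint_union(s, t):
    """Disjoint union, with elements tagged by 0 or 1 to separate the domains."""
    (d1, r1, c1), (d2, r2, c2) = s, t
    assert not set(c1) & set(c2), "source labels must be disjoint"
    dom = {(0, x) for x in d1} | {(1, x) for x in d2}
    rel = {}
    for side, r in ((0, r1), (1, r2)):
        for name, tuples in r.items():
            rel.setdefault(name, set()).update(
                tuple((side, x) for x in tup) for tup in tuples)
    src = {c: (0, x) for c, x in c1.items()}
    src.update({c: (1, x) for c, x in c2.items()})
    return dom, rel, src

def fus(s, a, b):
    """Identify the a-source with the b-source (the b-source is kept)."""
    dom, rel, src = s
    keep, drop = src[b], src[a]
    m = lambda x: keep if x == drop else x
    return ({m(x) for x in dom},
            {n: {tuple(m(x) for x in t) for t in ts} for n, ts in rel.items()},
            {c: m(x) for c, x in src.items()})

def parallel(s, t):
    """Parallel composition s || t as a composition of the operations above."""
    shared = sorted(set(s[2]) & set(t[2]))
    fresh = {c: ("fresh", c) for c in shared}   # hypothetical fresh labels
    for c in shared:
        t = srcren(t, c, fresh[c])              # step 1: rename in t
    u = disjoint_union(s, t)                    # step 2: disjoint union
    for c in shared:
        u = fus(u, fresh[c], c)                 # step 3: fuse c and its copy
    for c in shared:
        u = srcfg(u, fresh[c])                  # step 4: forget the copies
    return u
```

For instance, composing an edge from an a-source to a c-source with an edge from a c-source to a b-source yields a path with three vertices.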

3.5.2 Source-separated structures 

The property that c_S ≠ c'_S for c ≠ c' is called source separation. This property
makes it easier to work with operations on structures and graphs, and hence we
discuss a variant of the S-algebra StS, which handles source-separated
structures. We will also use it in Section [?].

Recall that StS_sep(R, C) denotes the set of source-separated structures in
StS(R, C). We now define a subsignature S_sep of S such that StS_sep is a
sub-algebra of StS.

Disjoint union ⊕ clearly preserves source separation, and is part of S_sep.
Next we include in S_sep the operations specified by qfd operation schemes such
that, for each c ∈ C and d ≠ d' ∈ C' (see the notation in Section 3.2),

$$\kappa_{c,d} \Rightarrow \neg\kappa_{c,d'}, \qquad (1)$$

which guarantees that the operation preserves source separation.

Example 3.16 The operations srcren_{a→b} and srcfg_a defined in Example 3.4 are
in S_sep. The operation fus_{a,b} defined in the same example is not.

In contrast, the operation written fus_{a→b}, which identifies the a-source and
the b-source of a structure as in fus_{a,b}, and makes the resulting element of the
domain a b-source but not an a-source, preserves source separation. It can be
written as fus_{a→b} = srcfg_a ∘ fus_{a,b}.

The operation which, given a graph with source labels a and b, exchanges
the source labels a and b if the corresponding vertices are linked by an edge and
does nothing otherwise, is another example of a qfd operation in S_sep. □

Regarding the effectiveness of the definition of S_sep, we observe the following.

Proposition 3.17 Given a qfd operation scheme, one can decide whether the
corresponding qfd operation preserves source separation.

Proof. Let g be the qfd operation specified by the given qfd operation scheme,
and let StS(R, C) be the domain of g. One can effectively construct the images
under g of every type in StS(R, C), since there are only finitely many of them,
and they can all be enumerated. One can then verify whether the operation
preserves source separation on types.

Now it follows from the proof of Proposition 3.9 that for each S ∈ StS(R, C),
we have ζ(g(ζ(S))) = ζ(g(S)). In particular, g preserves source separation if
and only if it preserves it for the structures of the form ζ(S). Thus one can
effectively decide whether g ∈ S_sep. □

We now show that the restriction to source-separated structures does not
change the notion of recognizability.

Theorem 3.18 Let L be a subset of StS_sep. Then L is S-recognizable if and
only if it is S_sep-recognizable.

Proof. By definition, S_sep is a subsignature of S, so every S-recognizable set is
S_sep-recognizable.

To prove the converse, we first define a mapping h, which maps a structure
S ∈ StS(R, C) to a source-separated structure h(S) ∈ StS_sep(R, C) by splitting
sources that were identified in S.

We assume that the countable set of constant symbols (from which C is
taken, see Section 3.2) is linearly ordered. Let h_S: C → C be given by

$$h_S(c) = \min\{d \in C \mid c_S = d_S\}.$$

We let C_0^S = h_S(C) and C_1^S = C \ C_0^S. The structure h(S) has domain set
the disjoint union of S and C_1^S. For each c ∈ C_0^S, the c-source of h(S) is the
c-source of S, and for each c ∈ C_1^S, the c-source of h(S) is the element c ∈ C_1^S.
Finally, for each r ∈ R, the relation r_{h(S)} equals the relation r_S (so it does not
involve the elements of C_1^S). Observe that h is not a qfd operation, and that
h_S, C_0^S and C_1^S depend only on ζ(S).
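
The mapping h can be made concrete as follows. This is a sketch under assumptions of our own: structures are encoded as triples (domain, relations, sources), the linear order on constants is the usual order on Python strings, and the name split_sources is hypothetical. Elements of S are tagged "old" and the new elements, one for each label in C_1^S, are tagged "new".

```python
def split_sources(structure):
    """Return h(S): the source-separated structure obtained by splitting sources."""
    domain, relations, sources = structure
    # h_S(c) = least label d designating the same element as c.
    rep = {c: min(d for d in sources if sources[d] == sources[c])
           for c in sources}
    c0 = set(rep.values())        # labels that keep their source
    c1 = set(sources) - c0        # labels that receive a new, isolated source
    new_domain = {("old", x) for x in domain} | {("new", c) for c in c1}
    new_relations = {
        name: {tuple(("old", x) for x in tup) for tup in tuples}
        for name, tuples in relations.items()
    }
    new_sources = {
        c: (("old", sources[c]) if c in c0 else ("new", c)) for c in sources
    }
    return new_domain, new_relations, new_sources
```

The relations are copied unchanged, so the new elements are isolated, as in the definition above.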

Now let L be an S_sep-recognizable subset of StS_sep and let ≡ be a locally
finite S_sep-congruence recognizing it. We need to construct a locally finite
S-congruence ~ on StS which recognizes L.

The relation ~ on StS is defined as follows. If S, T ∈ StS(R, C), we say
that S ~ T if ζ(S) = ζ(T) and h(S) ≡ h(T). It is immediately verified that ~
is an equivalence relation. Moreover, the ~-class of a structure S is determined
by its ζ-class, and by the ≡-class of h(S). Since both ζ and ≡ are locally finite,
~ also is locally finite.

Let us now prove that ~ is an S-congruence. Let S ~ T ∈ StS(R, C) and
S' ~ T' ∈ StS(R', C'), with C ∩ C' = ∅. By Proposition 3.9, ζ(S ⊕ S') =
ζ(T ⊕ T'). It is not difficult to verify that

h(S ⊕ S') = h(S) ⊕ h(S').

It follows that h(S ⊕ S') ≡ h(T ⊕ T') since ⊕ is an operation in S_sep. Thus
S ⊕ S' ~ T ⊕ T'.

Next let g be a qfd operation from StS(R, C) to StS(Q, B), given by the
qfd operation scheme (δ, (φ_q)_{q∈Q}, (κ_{c,b})_{c∈C, b∈B}). Let S and T be ~-equivalent
elements of StS(R, C), which will remain fixed for the rest of this proof. We
need to show that g(S) ~ g(T). We already know from Proposition 3.9 that
if S ~ T ∈ StS(R, C), then ζ(g(S)) = ζ(g(T)), and we want to show that
h(g(S)) ≡ h(g(T)).

Since ζ(g(S)) = ζ(g(T)), the mappings h_{g(S)} and h_{g(T)}, from B to B,
coincide. Let B_0 = h_{g(S)}(B) and B_1 = B \ B_0. Without loss of generality, we
may assume that B_1 ∩ C = ∅. The domain set of h(g(S)) (resp. h(g(T))) is the
disjoint union of the domain of g(S) (resp. g(T)) and B_1.

It suffices to show that there exists a qfd operation k ∈ S_sep, depending on
g and ζ(S), such that h(g(S)) = k(h(S) ⊕ B_1) and h(g(T)) = k(h(T) ⊕ B_1)
(where B_1 is the source-only element of StS_sep(∅, B_1)). Indeed, the fact that ≡
is an S_sep-congruence will then imply that h(g(S)) ≡ h(g(T)).

Let δ' be obtained from δ by replacing every occurrence of c ∈ C by h_S(c).
For each q ∈ Q, c ∈ C and b ∈ B, let φ'_q be obtained from φ_q and κ'_{c,b} be
obtained from κ_{c,b} in the same fashion.

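Let now k': StS(R, C ∪ B_1) → StS(Q, B) be defined by the scheme
(γ', (χ'_q)_{q∈Q}, (λ'_{c,b})_{c∈C∪B_1, b∈B}) defined as follows:

$$
\begin{aligned}
\gamma'(x) &= \Bigl(\delta'(x) \wedge \bigwedge_{c \in C_1^S} \neg(x = c)\Bigr) \vee \bigvee_{b \in B_1} (x = b) \\
\chi'_q &= \varphi'_q \quad \text{for each } q \in Q \\
\lambda'_{b,b} &= \text{true} \quad \text{if } b \in B_1 \\
\lambda'_{c,b} &= \text{false} \quad \text{if } b \in B_1 \text{ and } c \neq b \\
\lambda'_{c,b} &= \text{false} \quad \text{if } b \in B \text{ and } c \in C_1^S \\
\lambda'_{c,b} &= \bigvee_{h_{g(S)}(a) = b,\ h_S(d) = c} \kappa'_{d,a} \quad \text{if } b \in B_0 \text{ and } c \in C_0^S.
\end{aligned}
$$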

It is now a routine verification that (for our fixed structure S) k'(h(S) ⊕
B_1) = h(g(S)). Since all our definitions depend only on ζ(S), we also have
k'(h(T) ⊕ B_1) = h(g(T)).

One last step is required in this proof as the qfd operation k' may not
preserve source separation for all structures, that is, k' may not lie in S_sep. It
does for the particular structures h(S) ⊕ B_1 and h(T) ⊕ B_1, but perhaps not for
others. Actually, structures U such that ζ(U) ≠ ζ(h(S) ⊕ B_1) = ζ(h(T) ⊕ B_1)
do not matter in this context, so we can replace k' by the operation k, with
the same domain and range as k', which maps a structure U to k'(U) if ζ(U) =
ζ(h(S) ⊕ B_1), and otherwise to the source-only source-separated structure
B ∈ StS(Q, B) where all relations are empty. This new operation k preserves
source separation by construction, and it is easily verified to be qfd. This
completes (at last) the proof. □



4 The algebra GP of graphs with ports

Graphs with ports were introduced in Section 3.1. Recall that if P is a set of
unary relation symbols, then E_P denotes the set E_P = {edge} ∪ P and the class
of graphs with ports in P, written GP(P), can be identified with StS(E_P). We
observe that a vertex of a graph with ports in P can be a p-port for one or
several port labels p ∈ P, or for none at all.

For convenience, we will consider that P is a finite subset of the set N of
natural integers.

4.1 The signature VR on graphs with ports

We define the set of sorts of the algebra GP to be the set of finite subsets of N.
For each such subset P, the set of elements of GP of sort P is the set GP(P) of
graphs with ports in P.

The signature VR consists of constants, unary operations and binary
operations. These operations (interpreted in GP) are as follows.

First, if P, Q are finite subsets of N, then ⊕ is as in StS, and is thus a binary
operation of type (E_P, E_Q) → E_{P∪Q}. In GP, we consider ⊕ as an operation of
type (P, Q) → P ∪ Q.

Next, the unary operations of VR are the following (clearly qfd) operations:

• if p, q are distinct integers, add_{p,q} is an operation of type P → P for each
sort P such that p, q ∈ P: it modifies neither the domain (the set of
vertices) nor the unary relations p (p ∈ P); the new edge relation has the
existing edges, plus every edge from a p-port to a q-port: it is given by

edge(x, y) ∨ (p(x) ∧ q(y));

• if D is a finite subset of N × N, mdf_D is an operation of type P → Q where
P is any finite set containing the domain of the relation D and Q is any
finite set containing the range of D; it modifies neither the domain (set of
vertices) nor the edge relation; for each q ∈ Q, the q-ports of the output
structure are the vertices of the input structure that are p-ports for some
p such that (p, q) ∈ D; that is, q(x) is given by ⋁_{(p,q)∈D} p(x).
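
The two unary operations above are easy to experiment with. In the sketch below, which relies on an encoding chosen for this illustration only, a graph with ports is a triple (vertices, edges, ports) where ports maps each port label to the set of vertices carrying it; the names add_edges and mdf are hypothetical.

```python
def add_edges(graph, p, q):
    """add_{p,q}: keep the existing edges and add one from every p-port to every q-port."""
    vertices, edges, ports = graph
    new = {(x, y) for x in ports.get(p, set()) for y in ports.get(q, set())}
    return vertices, edges | new, ports

def mdf(graph, D, Q):
    """mdf_D with target sort Q: x is a q-port iff x is a p-port for some (p, q) in D.

    Q is assumed to contain the range of D.
    """
    vertices, edges, ports = graph
    new_ports = {q: set() for q in Q}
    for p, q in D:
        new_ports[q] |= ports.get(p, set())
    return vertices, edges, new_ports
```

Note that add_edges creates a loop on any vertex that is both a p-port and a q-port, exactly as the formula edge(x, y) ∨ (p(x) ∧ q(y)) prescribes.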

Finally, for each integer p, we let p be the constant of type {p} denoting the
graph with a single vertex, no edges, and whose vertex is a p-port. We also let
p^loop denote the same graph, with a single loop.

Remark 4.1 The following operations on graphs with ports occur in the
literature, and are particular cases of VR-operations.

Let p ≠ q be integers, P be a subset of N containing p and Q = P \ {p} ∪ {q}.
The operation ren_{p→q}, of type P → Q, which renames every p-port to a q-port, is
an operation of VR: it is equal to mdf_D where D = {(r, r) | r ∈ P \ {p}} ∪ {(p, q)}.
Observe that this operation fuses the sets of vertices defined by p and q.

Let p be an integer, and let P be a subset of N containing p. The operation
fg_p, of type P → P \ {p}, which forgets p-ports is an operation of VR: it is equal
to mdf_D where D = {(r, r) | r ∈ P \ {p}}. □
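
The two identities of Remark 4.1 can be checked on small examples. The sketch below assumes, for illustration only, that a graph with ports is encoded as a triple (vertices, edges, ports) whose third component maps each label of the sort P to a set of vertices; the function names are hypothetical.

```python
def mdf(graph, D, Q):
    """mdf_D with target sort Q (Q is assumed to contain the range of D)."""
    vertices, edges, ports = graph
    new_ports = {q: set() for q in Q}
    for p, q in D:
        new_ports[q] |= ports.get(p, set())
    return vertices, edges, new_ports

def ren(graph, p, q):
    """ren_{p->q} expressed as mdf_D with D = {(r, r) | r in P \\ {p}} ∪ {(p, q)}."""
    P = set(graph[2])
    D = {(r, r) for r in P - {p}} | {(p, q)}
    return mdf(graph, D, (P - {p}) | {q})

def fg(graph, p):
    """fg_p expressed as mdf_D with D = {(r, r) | r in P \\ {p}}."""
    P = set(graph[2])
    D = {(r, r) for r in P - {p}}
    return mdf(graph, D, P - {p})
```

When q already belongs to P, ren merges the p-ports into the existing q-ports, which is the fusion mentioned in the remark.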

Remark 4.2 In our definition of graph with ports, an element of GP(Q) does
not need to have q-ports for each q ∈ Q. Thus, if P ⊆ Q, every graph with
ports in P can also be viewed as a graph with ports in Q. The natural inclusion
of GP(P) into GP(Q) is part of the signature VR: it is equal to mdf_D where
D = {(p, p) | p ∈ P}. □

Remark 4.3 Again (as in Example 3.4), the operations introduced in this
section are denoted by overloaded symbols. A formal definition should specify the
type of the operation, and would read something like add_{p,q,P} or mdf_{D,P,Q}. We
prefer the more concise notation introduced here. □



4.2 A technical result 

The following result describes the action of a qfd operation on a disjoint union of
structures. It is the key to the main results of this section, described in Section
4.3 below.

Proposition 4.4 Let ζ be the type congruence (see Section 3.3). Let h be
a unary qfd operation on StS, from StS(R, C) to StS(E_Q, ∅) = GP(Q), let
(R_1, C_1) and (R_2, C_2) be sorts of StS such that R = R_1 ∪ R_2, C_1 ∩ C_2 = ∅ and
C = C_1 ∪ C_2, and let z = (z_1, z_2) with z_1 a ζ-class in StS(R_1, C_1) and z_2 a
ζ-class in StS(R_2, C_2).

Then there exist quantifier-free definable operations
g_{1,z}: StS(R_1, C_1) → GP(Q_{1,z}), g_{2,z}: StS(R_2, C_2) → GP(Q_{2,z}), and
f_z: GP(Q_{1,z} ∪ Q_{2,z}) → GP(Q), such that

• f_z is a composition of unary operations in VR;

• for each x_1 ∈ StS(R_1, C_1) in class z_1 and each x_2 ∈ StS(R_2, C_2) in class
z_2, h(x_1 ⊕ x_2) = f_z(g_{1,z}(x_1) ⊕ g_{2,z}(x_2)).

Proof. Let (δ, ψ_edge, (ψ_q)_{q∈Q}) be the qfd operation scheme defining the
operation h: here ψ_edge defines the edge relation, ψ_q defines the q-ports (q ∈ Q),
and there is no formula of the form κ_{c,d} since the range of h is in GP(Q) =
StS(E_Q, ∅). The formulas δ, ψ_edge and ψ_q, for q ∈ Q, are in the language of
(R, C)-structures.

The atoms of δ(v) are either of the form r(y_1, ..., y_{ρ(r)}) (r ∈ R), or v = c, or
c_1 = c_2 (c, c_1, c_2 ∈ C). Let δ^1 be the formula obtained from δ(v) by substituting
the Boolean value 0 (false) for the following atoms, which are certainly false in a
disjoint sum x_1 ⊕ x_2, with x_1 ∈ StS(R_1, C_1), x_2 ∈ StS(R_2, C_2) and the variable
v interpreted in x_1:

• each r-atom such that r ∉ R_1 and an argument of r is v or a constant in
C_1;

• each r-atom such that r ∉ R_2 and an argument of r is a constant in C_2;

• each r-atom such that r ∈ R_1 ∩ R_2, an argument of r is a constant in C_2,
and another argument of r is v or a constant in C_1;

• each atom of the form y = c such that c ∈ C_2 and y is equal to v or to a
constant in C_1.

The remaining atoms in δ^1 are either in QF(R_1, C_1, {v}) or in QF(R_2, C_2, ∅).
Note that the ζ-class of an element of StS(R_2, C_2) determines entirely which
formulas in QF(R_2, C_2, ∅) it satisfies. For each z as in the statement of the
proposition, we let δ^{1,z} be the formula in QF(R_1, C_1, {v}) obtained from δ^1 by
replacing each atom in QF(R_2, C_2, ∅) by the Boolean value 0 or 1 according to
the ζ-class z_2. We observe that if v is a vertex of x_1 ⊕ x_2 which happens to be
in x_1, then

δ(v) ⟺ δ^{1,z}(v) whenever the ζ-class of x_2 is z_2.

For each q ∈ Q, let ψ_q^{1,z} be defined similarly. Then we also have, if v is a vertex
of x_1 ⊕ x_2 in x_1,

ψ_q(v) ⟺ ψ_q^{1,z}(v) whenever the ζ-class of x_2 is z_2.

Let also δ^{2,z} and ψ_q^{2,z} be defined dually. And again, if i, j ∈ {1, 2}, we let
ψ_edge^{i,j}(v, w) be the formula obtained from ψ_edge by substituting the Boolean
value 0 for the following atoms, which are certainly false in a disjoint sum
x_1 ⊕ x_2 for the variable v interpreted in x_i and the variable w interpreted in x_j:

• each r-atom such that r ∉ R_i and v is an argument of r;

• each r-atom such that r ∉ R_j and w is an argument of r;

• each r-atom such that r ∉ R_1 and a constant in C_1 is an argument of r;

• each r-atom such that r ∉ R_2 and a constant in C_2 is an argument of r;

• each r-atom such that r ∈ R_1 ∩ R_2, an argument of r is a constant in C_2,
and another argument of r is a constant in C_1;

• each r-atom such that r ∈ R_1 ∩ R_2, an argument of r is v (resp. w) and
another argument of r is a constant in C_{3-i} (resp. C_{3-j});

• each atom of the form v = c with c ∈ C_{3-i}, w = c with c ∈ C_{3-j}, or
c_1 = c_2 with c_1 ∈ C_1 and c_2 ∈ C_2;

• if i ≠ j, each r-atom such that r ∈ R_1 ∩ R_2, and v and w are arguments
of r.

As above, the remaining atoms in ψ_edge^{1,1} are in QF(R_1, C_1, {v, w}) ∪ QF(R_2, C_2, ∅),
and for each z, we let ψ_edge^{1,1,z} be obtained from ψ_edge^{1,1} by substituting the Boolean
values 0 or 1 for the atoms in QF(R_2, C_2, ∅) according to the ζ-class z_2. If v, w
are vertices of x_1 ⊕ x_2 in x_1, and if the ζ-class of x_2 is z_2, then

ψ_edge(v, w) ⟺ ψ_edge^{1,1,z}(v, w).

We define ψ_edge^{2,2,z} similarly, and get the analogous equivalence.

If i ≠ j, the atoms of ψ_edge^{i,j} are in QF(R_i, C_i, {v}) and in QF(R_j, C_j, {w})
(which may include atoms in QF(R_1, C_1, ∅) and in QF(R_2, C_2, ∅)). Again, we
let ψ_edge^{i,j,z} be obtained from ψ_edge^{i,j} by substituting the Boolean values 0 or 1 for
the atoms without free variables according to the ζ-classes z_1 and z_2. And we
observe that if v, w are vertices of x_1 ⊕ x_2, where v is in x_i, w is in x_j, and
the ζ-classes of x_i and x_j are z_i and z_j respectively, then

ψ_edge(v, w) ⟺ ψ_edge^{i,j,z}(v, w).

Now let k = 1 + max(Q), let X_{k+1}, ..., X_ℓ be an enumeration of the subsets
of QF(R_1, C_1, {y}), and let Y_{ℓ+1}, ..., Y_m be an enumeration of the subsets of
QF(R_2, C_2, {y}). Let us denote by Q_1 the set Q ∪ {k+1, ..., ℓ} and by Q_2 the
set Q ∪ {ℓ+1, ..., m}.

We define the qfd operation g_{1,z}: StS(R_1, C_1) → GP(Q_1) by the
following operation scheme:

(δ^{1,z}, ψ_edge^{1,1,z}, (ψ_q^{1,z})_{q∈Q}, (θ_n)_{k+1 ≤ n ≤ ℓ}),

where for each k+1 ≤ n ≤ ℓ, θ_n(v) holds if the set of quantifier-free formulas
in QF(R_1, C_1, {y}) satisfied by v is exactly X_n.

Similarly, the qfd operation g_{2,z}: StS(R_2, C_2) → GP(Q_2) is defined by the
operation scheme

(δ^{2,z}, ψ_edge^{2,2,z}, (ψ_q^{2,z})_{q∈Q}, (θ_n)_{ℓ+1 ≤ n ≤ m}),

where for each ℓ+1 ≤ n ≤ m, θ_n(v) holds if the set of quantifier-free formulas
in QF(R_2, C_2, {y}) satisfied by v is exactly Y_n.

Finally, we consider structures x_1 ∈ StS(R_1, C_1) and x_2 ∈ StS(R_2, C_2),
with ζ-classes respectively z_1 and z_2, and we compare the graphs with ports
g_{1,z}(x_1) ⊕ g_{2,z}(x_2) and h(x_1 ⊕ x_2). The above remarks show that these two
graphs have the same set of vertices, the same q-ports (q ∈ Q), and the same
edges between two vertices of x_1 or two vertices of x_2. On the other hand,
g_{1,z}(x_1) ⊕ g_{2,z}(x_2) misses the edges of h(x_1 ⊕ x_2) that connect a vertex of x_1
with a vertex of x_2.

These edges are captured by the formulas ψ_edge^{1,2,z} and ψ_edge^{2,1,z}. Now, if v is a
vertex of x_1 and w is a vertex of x_2, we already observed that the truth values
of ψ_edge^{1,2,z}(v, w) and ψ_edge^{2,1,z}(w, v) are entirely determined by the quantifier-free
formulas with one free variable satisfied by v in x_1 and by w in x_2, that is, they
are entirely determined by the (unique) index k+1 ≤ n ≤ ℓ such that θ_n(v)
and by the (unique) index ℓ+1 ≤ n ≤ m such that θ_n(w). In other words,
ψ_edge^{1,2,z}(a, b) and ψ_edge^{2,1,z}(b, a) are equivalent to disjunctions of conjunctions of
the form

θ_n(a) ∧ θ_u(b) for some k+1 ≤ n ≤ ℓ and ℓ+1 ≤ u ≤ m.

Thus the edges in h(x_1 ⊕ x_2) from a vertex of x_1 to a vertex of x_2 can be
created from g_{1,z}(x_1) ⊕ g_{2,z}(x_2) by applying repeatedly the operations (in VR)
of the form add_{n,u} such that n ∈ [k+1, ℓ], u ∈ [ℓ+1, m] and θ_n ∧ θ_u is a
disjunct of ψ_edge^{1,2,z}.

Similarly, the edges in h(x_1 ⊕ x_2) from a vertex of x_2 to a vertex of x_1 can
be created from g_{1,z}(x_1) ⊕ g_{2,z}(x_2) by applying the appropriate operations of
the form add_{u,n}. The last operation consists in forgetting the auxiliary ports
numbered k+1 to m, that is, in applying the operation mdf_D, with D = {(q, q) |
q ∈ Q}. □



4.3 Recognizable sets of graphs with ports 

In this section, we consider different notions of recognizability that can be used
for sets of graphs with ports. Let $L \subseteq \mathcal{GP}(P)$. Then $L$ can be VR-recognizable,
as a subset of the VR-algebra $\mathcal{GP}$. It can also be $\mathcal{S}$-recognizable, as a subset
of the $\mathcal{S}$-algebra $\mathcal{StS}$ since $\mathcal{GP}(P) = \mathcal{StS}(E_P)$. Finally, we introduce another
signature, written VR$^+$, on $\mathcal{GP}$: it is obtained from VR by adding all the qfd
operations between the sorts of $\mathcal{GP}$.

Theorem 4.5 Let $P$ be a finite subset of $\mathbb{N}$ and let $L$ be a subset of $\mathcal{GP}(P)$.
The following properties are equivalent:

1 $L$ is $\mathcal{S}$-recognizable;

2 $L$ is VR$^+$-recognizable;

3 $L$ is VR-recognizable.

Proof. Since the operations of VR are operations of VR$^+$, and the operations
of VR$^+$ are operations of $\mathcal{S}$, it follows from Proposition 2.1 that (1) implies (2),
and (2) implies (3). Thus, we only need to verify that (3) implies (1).

We use Lemma 2.5 with $\mathcal{F} = \mathrm{VR}$, $S = \mathcal{GP}$, $\mathcal{G} = \mathcal{S}$, $T = \mathcal{StS}$, and $\zeta$ the
type congruence (see Section 3.3), which relates structures with sources of the
same sort, provided they satisfy the same quantifier-free formulas. We use the
collection $\mathcal{H}$ of sets $\mathcal{H}_{(R,C),P}$ of unary qfd operations from $\mathcal{StS}(R, C)$ to $\mathcal{GP}(P)$.

Let $L$ be a VR-recognizable subset of $\mathcal{GP}(P)$ and let $\equiv$ be a locally finite
VR-congruence on $\mathcal{GP}$ such that $L$ is a union of $\equiv$-classes. Since $\zeta$ is a locally
finite $\mathcal{S}$-congruence on $\mathcal{StS}$ (Proposition 3.9), its restriction to $\mathcal{GP}$ is also a
locally finite VR-congruence; and the intersection of $\equiv$ and $\zeta$ is a locally finite
VR-congruence on $\mathcal{GP}$ which saturates $L$. Thus we can assume, without loss of
generality, that $\equiv$-equivalent elements of $\mathcal{GP}$ are also $\zeta$-equivalent.

Next we consider the equivalence relation $\approx$ on $\mathcal{StS}$ defined as in Lemma 2.5.
Note that the identity of $\mathcal{GP}(P)$ belongs to $\mathcal{H}_{(E_P,\emptyset),P}$ so that $\approx$-equivalent
elements of $\mathcal{GP}(P) = \mathcal{StS}(E_P, \emptyset)$ are also $\equiv$-equivalent. In particular, $\approx$ saturates
$L$ and it suffices to show that $\approx$ is locally finite and is an $\mathcal{S}$-congruence. In view of
Lemma 2.5 it is enough to verify that $\mathcal{H}$ satisfies the assumptions of Lemmas 2.3
and 2.4.

We first verify the hypothesis of Lemma 2.3. Let $g$ be an operation of $\mathcal{S}$:
either $g$ is a unary qfd operation or $g = \oplus$. In the latter case, Proposition 4.4
states precisely that the required property holds.

If $g$ is a qfd operation of type $(R_1, C_1) \to (R, C)$, and $h \in \mathcal{H}_{(R,C),P}$, then
$h \circ g$ is a qfd operation (Lemma 3.7) and hence, $h_1 = h \circ g \in \mathcal{H}_{(R_1,C_1),P}$. Now
letting $f$ be the identity mapping of $\mathcal{GP}(P)$, we find that $h(g(x)) = f(h_1(x))$ as
required. In this case, $h_1$ and $f$ do not depend on the $\zeta$-class of $x$.

Next, we turn to the verification of the hypothesis of Lemma 2.4. Let $\varphi_1,
\ldots, \varphi_k$ be an enumeration of the elements of $QF(R, C, \{x\})$ and let $\chi_1, \ldots, \chi_\ell$
be an enumeration of the elements of $QF(R, C, \{x, y\})$.

Thus, a qfd operation scheme from $\mathcal{StS}(R, C)$ into $\mathcal{GP}(Q)$ consists in the
choice of a formula $\delta = \varphi_{i_0}$ ($1 \le i_0 \le k$), a formula $\psi_{\mathrm{edge}} = \chi_j$ ($1 \le j \le \ell$), a
sequence of formulas $\varphi_{i_1}, \ldots, \varphi_{i_r}$ ($1 \le i_1 < \ldots < i_r \le k$), and a partition of $Q$
as $Q = Q_1 \cup \cdots \cup Q_r$: if $q \in Q_j$, then $\psi_q = \varphi_{i_j}$. (If $Q = \emptyset$, then $r = 0$.)

Let us now consider two unary qfd operations $g\colon \mathcal{StS}(R, C) \to \mathcal{GP}(Q)$ and
$g'\colon \mathcal{StS}(R, C) \to \mathcal{GP}(Q')$, associated with the same choice of values $i_0$, $j$ and
$i_1 < \ldots < i_r$. Let $Q = Q_1 \cup \cdots \cup Q_r$ and $Q' = Q'_1 \cup \cdots \cup Q'_r$ be the corresponding
partitions of $Q$ and $Q'$. Finally let $\pi, \pi_0, \pi_1, \ldots, \pi_r$ be the following operations
in the signature VR. These operations share the property that they do not
alter the graph structure and modify only the port predicates.

The mapping $\pi_0$ shifts every port index of an element of $\mathcal{GP}(Q)$ by $m =
\max(Q')$, to yield a graph with ports in $Q + m$, whose port names do not
intersect $Q'$. We let $R_h = Q_h + m$ for $1 \le h \le r$.

For $1 \le h \le r$, $\pi_h = \mathrm{mdf}_{D_h}$ where

    $D_h = \{(a, a) \mid a \in \bigcup_{i<h} Q'_i \cup \bigcup_{i>h} R_i\} \cup (R_h \times Q'_h)$.

Thus $\pi_h$ turns a graph with ports in $Q'_1 + \cdots + Q'_{h-1} + R_h + R_{h+1} + \cdots + R_r$ into a
graph with ports in $Q'_1 + \cdots + Q'_{h-1} + Q'_h + R_{h+1} + \cdots + R_r$, with the same vertex
set, the same edge relation, the same $q$-ports for each $q \in \bigcup_{i<h} Q'_i \cup \bigcup_{i>h} R_i$,
and with each $r$-port ($r \in R_h$) turned into a $q$-port for each $q \in Q'_h$.

It is now an easy verification that, if $\pi = \pi_r \circ \cdots \circ \pi_1 \circ \pi_0$, then $g'(x) = \pi(g(x))$
for each $x \in \mathcal{StS}(R, C)$. Thus the quasi-order $\le_{(R,C)}$ defined in Lemma 2.4 is
in fact a finite index equivalence relation, and this concludes the proof. □

Remark 4.6 This actually proves also that we get the same recognizable sets
of graphs with ports, if we consider $\mathcal{GP}(Q)$ as a domain of sort $Q$ in the algebra
of structures without sources, which consists of the domains $\mathcal{StS}(R, \emptyset)$
equipped with the operations of $\mathcal{S}$ between them. If we were only interested
in the equivalence of this recognizability with VR- and VR$^+$-recognizability (or
just the equivalence between VR- and VR$^+$-recognizability), we could do with
Lemmas 2.3 and 2.4 instead of Lemma 2.5 and with a simpler version of
Proposition 4.4 making no reference to $\zeta$. □



4.4 Variants of the algebra of graphs with ports 

The first variant considered here replaces the signature VR by a smaller sig- 
nature, which we will see is equivalent to VR in terms of recognizability. The 
second one concerns a certain class of graphs with ports, and is central in the 
definition of the clique-width of a finite graph. 

4.4.1 A variant of VR on QV 

In Section 4.3 we exhibited signatures larger than VR, for which all the
VR-recognizable sets of graphs with ports are recognizable: namely the signature
VR$^+$ on $\mathcal{GP}$ and the signature $\mathcal{S}$ on the wider algebra $\mathcal{StS}$. In contrast, we
exhibit in this section a smaller signature (in fact, a signature consisting of
VR-derived operations) which does not create new recognizable subsets.

The basic idea behind the definition of this new signature is the following: 
when we evaluate a VR-term $t$ of the form $\mathrm{add}_{p,q}(t')$, then we add edges from
each $p$-port of $G'$, the value of $t'$, to each of its $q$-ports. It may happen that
some edges from a $p$-port to a $q$-port already exist in $G'$. In this case, we do
not add a parallel edge since we are dealing with simple graphs. Thus the term 
t presents a form of redundancy, since some of its edges may be, in some sense, 
defined twice. 

For disjoint sets of port labels $P$ and $Q$, we denote by $J(P, Q)$ the set of
VR-derived unary operations defined by terms of the form $f_1(f_2(\cdots(f_n(x))\cdots))$,
where the $f_i$ are of the forms $\mathrm{add}_{p,q}$ or $\mathrm{add}_{q,p}$ for $p$ in $P$ and $q$ in $Q$. Since the
operations $\mathrm{add}_{p,q}$ are idempotent and commute with one another, an operation
in $J(P, Q)$ is completely described by a subset of $(P \times Q) \cup (Q \times P)$. Thus
$J(P, Q)$ is finite, although one can write infinitely many terms specifying its
elements. For each element $J \in J(P, Q)$, we let $\otimes_J$ denote the binary operation
defined, for $G \in \mathcal{GP}(P)$ and $H \in \mathcal{GP}(Q)$, by $G \otimes_J H = J(G \oplus H)$.
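As an illustration only, the following Python sketch models a graph with ports as a dictionary holding a vertex set, an edge set and a set of (vertex, label) pairs, and implements $\oplus$, $\mathrm{add}_{p,q}$ and $\otimes_J$ on that representation. The representation, the function names and the choice to skip loops in `add` are assumptions of this sketch, not definitions taken from the paper.

```python
def disjoint_union(g, h):
    """G (+) H: tag vertices with 0 or 1 so the two vertex sets stay disjoint."""
    def tag(side, graph):
        return {
            "V": {(side, v) for v in graph["V"]},
            "E": {((side, u), (side, v)) for u, v in graph["E"]},
            "ports": {((side, v), p) for v, p in graph["ports"]},
        }
    a, b = tag(0, g), tag(1, h)
    return {key: a[key] | b[key] for key in ("V", "E", "ports")}


def add(g, p, q):
    """add_{p,q}: an edge from every p-port to every q-port.

    Edges live in a set, so an already present edge is never duplicated.
    Skipping u == v (no new loops) is an assumption of this sketch.
    """
    p_ports = {v for v, label in g["ports"] if label == p}
    q_ports = {v for v, label in g["ports"] if label == q}
    new_edges = {(u, v) for u in p_ports for v in q_ports if u != v}
    return {"V": set(g["V"]), "E": set(g["E"]) | new_edges, "ports": set(g["ports"])}


def otimes(g, h, J):
    """G (x)_J H = J(G (+) H), with J given as a collection of label pairs."""
    result = disjoint_union(g, h)
    for p, q in J:
        result = add(result, p, q)
    return result


# Hypothetical example with P = {"p"} and Q = {"q"}.
G = {"V": {1, 2}, "E": {(1, 2)}, "ports": {(1, "p"), (2, "p")}}
H = {"V": {1}, "E": set(), "ports": {(1, "q")}}
once = otimes(G, H, [("p", "q")])
twice = otimes(G, H, [("p", "q"), ("p", "q")])  # same operation, longer term
```

Because repeating a pair or permuting the pairs in `J` does not change the result, only the underlying set of pairs matters, which is the finiteness argument made above.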

We observe that in the evaluation of a term of the form $t \otimes_J t'$, the application
of $\otimes_J$ does not recreate edges that already exist in $G$, the value of $t$, or in $G'$, the
value of $t'$, since the $\mathrm{add}_{p,q}$ operations forming $\otimes_J$ add edges between the disjoint
graphs $G$ and $G'$ (because $p$ and $q$ are not port labels of the same argument
graph).

Now the signature NLC consists of the operations $\otimes_J$ as above, the unary
qfd operations of the form $\mathrm{fg}_p$ and $\mathrm{ren}_{p \to q}$ as defined in Remark 4.1 and the
constants $p$ and $p^{\mathrm{loop}}$ as in VR. We denote by $\mathcal{GP}^{\mathrm{NLC}}$ the NLC-algebra of graphs
with ports.

Remark 4.7 The notation NLC refers to a very similar algebra used by Wanke
03|. □



Example 4.8 We have in fact already encountered NLC-operations and
NLC-derived operations.

The VR-derived operation whose existence is proved in Proposition 4.4
is actually NLC-derived. Consider indeed the last paragraphs of the proof of
that proposition: this operation is obtained by first composing operations
of the form $\mathrm{add}_{n,u}$ and $\mathrm{add}_{u,n}$, where the pairs $(n, u)$ lie in a certain subset of
$[k+1, \ell] \times [\ell+1, m]$ and the pairs $(u, n)$ lie in another subset of $[\ell+1, m] \times [k+1, \ell]$,
and then composing operations of the form $\mathrm{fg}_p$.

One can also check that the operations $\pi_0, \ldots, \pi_r$ at the end of the proof of
Theorem 4.5 are NLC-derived. □



Proposition 4.9 Let $P$ be a finite subset of $\mathbb{N}$ and let $L$ be a subset of $\mathcal{GP}(P)$.
Then $L$ is VR-recognizable if and only if $L$ is NLC-recognizable.

Proof. The proof is a simple extension of the proof of Theorem 4.5.

Since the operations of NLC are VR-derived, every VR-recognizable subset
of $\mathcal{GP}$ is NLC-recognizable. For the converse, we observe that the proof that (3)
implies (1) in Theorem 4.5 can be modified to show that an NLC-recognizable
subset of $\mathcal{GP}$ is $\mathcal{S}$-recognizable.

Again, we rely on Lemma 2.5 but now with $\mathcal{F} = \mathrm{NLC}$, $S = \mathcal{GP}$, and $\mathcal{G}$, $T$,
$\zeta$ and $\mathcal{H}$ as in Theorem 4.5.

In order to justify the fact that the arguments used in the proof of
Theorem 4.5 are also valid with these assumptions, we refer to Example 4.8. Indeed
this example shows two things: on one hand, the VR-derived operation of
Proposition 4.4 is in fact NLC-derived, so that the first hypothesis of Lemma 2.5 is
satisfied by this new choice of $\mathcal{F}$ and $S$. On the other hand the finiteness hypothesis of
Lemma 2.4 is also satisfied with this new value of $\mathcal{F} = \mathrm{NLC}$. This completes
the proof. □



4.4.2 Graphs whose port labels partition the vertex set 

In certain contexts, and in particular in the definition of the clique-width of a
graph (see Remark 4.11 below), one needs to consider graphs with ports where
port labels partition the vertex set. More precisely, for each set of port labels $P$,
let $\mathcal{GP}^\pi(P)$ be the set of elements of $\mathcal{GP}(P)$ such that each vertex is a port, and
no vertex is both a $p$-port and a $q$-port for $p \ne q$. Let also $\mathcal{GP}^\pi = (\mathcal{GP}^\pi(P))$.

Note that $\mathcal{GP}^\pi$ is preserved by the operations of the form $\oplus$, $\mathrm{add}_{p,q}$ and
$\mathrm{ren}_{p \to q}$. These operations form the signature VR$^\pi$, and $\mathcal{GP}^\pi$ is a VR$^\pi$-algebra.

Remark 4.10 The operation $\mathrm{add}_{p,q}$ is written $\alpha_{p,q}$ in ^Uj. □

Remark 4.11 The clique-width of a finite graph $G$, denoted by $\mathrm{cwd}(G)$, is
defined as the smallest cardinality of a set $P$ such that $G$ is the value of a
(finite) VR$^\pi$-term using a set $P$ of port labels, see [19, 17].

For algorithmic applications [50], it is useful to have efficient recognition
algorithms for classes of graphs of clique-width at most $k$. At the moment we
only know that this problem is in NP. It is polynomial for $k \le 3$, see [7]. □
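To make the definition concrete, here is a small Python sketch that evaluates a VR$^\pi$-term using only the two port labels 1 and 2 and whose value is the complete symmetric directed graph on $n$ vertices. The data representation (a vertex-to-label dictionary plus an edge set) and all function names are assumptions of this sketch; it only exhibits one term, hence an upper bound of 2 on the clique-width of these graphs.

```python
def vertex(name, label):
    """Constant: a single vertex carrying exactly one port label."""
    return {"label": {name: label}, "E": set()}


def oplus(g, h):
    """Disjoint union; the caller supplies distinct vertex names."""
    assert not (g["label"].keys() & h["label"].keys()), "vertex sets must be disjoint"
    return {"label": {**g["label"], **h["label"]}, "E": g["E"] | h["E"]}


def add(g, p, q):
    """add_{p,q}: an edge from each p-port to each q-port."""
    new_edges = {
        (u, v)
        for u, lu in g["label"].items()
        for v, lv in g["label"].items()
        if lu == p and lv == q and u != v
    }
    return {"label": dict(g["label"]), "E": g["E"] | new_edges}


def ren(g, p, q):
    """ren_{p->q}: every p-port becomes a q-port."""
    relabeled = {v: (q if l == p else l) for v, l in g["label"].items()}
    return {"label": relabeled, "E": set(g["E"])}


def clique(n):
    """Value of a term over labels {1, 2}: add one vertex at a time."""
    g = vertex(0, 1)
    for i in range(1, n):
        g = oplus(g, vertex(i, 2))       # new vertex is the only 2-port
        g = add(add(g, 1, 2), 2, 1)      # connect it to all earlier vertices, both ways
        g = ren(g, 2, 1)                 # recycle label 2 for the next step
    return g


k4 = clique(4)
```

Each vertex keeps exactly one label throughout, so every intermediate value stays in $\mathcal{GP}^\pi$.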

Proposition 4.12 Let $L$ be a subset of $\mathcal{GP}^\pi(P)$. Then $L$ is VR$^\pi$-recognizable
if and only if $L$ is VR-recognizable.

Proof. Since VR$^\pi$ consists of operations in VR, every locally finite
VR-congruence on $\mathcal{GP}$ induces a locally finite VR$^\pi$-congruence on $\mathcal{GP}^\pi$. In particular, if
$L$ is VR-recognizable, and hence is saturated by a locally finite VR-congruence
on $\mathcal{GP}$, then $L$ is saturated by a locally finite VR$^\pi$-congruence on $\mathcal{GP}^\pi$, and
hence $L$ is VR$^\pi$-recognizable.

To prove the converse, we first introduce the mapping $\sigma\colon \mathcal{GP} \to \mathcal{GP}^\pi$ defined
as follows. If $G \in \mathcal{GP}(P)$, then $\sigma(G)$ is the graph in $\mathcal{GP}^\pi(2^P)$ with the same
set of vertices and the same edge relation as $G$, and such that for each vertex
$v$ and each $X \subseteq P$, $v$ is an $X$-port in $\sigma(G)$ if and only if $X$ is the set of $p \in P$
such that $v$ is a $p$-port in $G$. We say that a port label $p$ is void in $G$ if there are
no $p$-ports in $G$.
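The mapping $\sigma$ can be sketched in a few lines of Python, assuming a graph with ports is stored as a dictionary with a vertex set, an edge set and a set of (vertex, label) pairs; this representation and the names `sigma` and `non_void` are illustrative assumptions. Each vertex receives as its single new label the `frozenset` of its old labels, so a vertex that was not a port becomes an $\emptyset$-port.

```python
def sigma(g):
    """Relabel every vertex by the set of port labels it carries in g."""
    labels_of = {
        v: frozenset(p for (u, p) in g["ports"] if u == v)
        for v in g["V"]
    }
    return {
        "V": set(g["V"]),
        "E": set(g["E"]),
        "ports": {(v, X) for v, X in labels_of.items()},
    }


def non_void(g):
    """Port labels that label at least one vertex of g."""
    return {p for (_, p) in g["ports"]}


# Hypothetical example: vertex 1 is both a p-port and a q-port, vertex 3 is no port.
G = {"V": {1, 2, 3}, "E": {(1, 2)}, "ports": {(1, "p"), (1, "q"), (2, "p")}}
sG = sigma(G)
```

In `sG` every vertex carries exactly one label, which is the defining property of $\mathcal{GP}^\pi$.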

Now let us assume that $L$ is VR$^\pi$-recognizable, and let $\equiv$ be a locally finite
congruence on $\mathcal{GP}^\pi$ saturating it. If $G, H \in \mathcal{GP}(P)$, we let $G \sim H$ if $\sigma(G)$ and
$\sigma(H)$ have the same non-void port labels, and $\sigma(G) \equiv \sigma(H)$. It is immediately
verified that $\sim$ is a locally finite equivalence relation.

We now verify that $\sim$ is a VR-congruence. If $G \in \mathcal{GP}(P)$ and $H \in \mathcal{GP}(Q)$,
it is easily seen that $\sigma(G \oplus H) = \sigma(G) \oplus \sigma(H)$. If $p, q \in P$, then $\sigma(\mathrm{add}_{p,q}(G)) =
f(\sigma(G))$ where $f$ is the composition of the operations $\mathrm{add}_{X,Y}$ for each $X, Y \subseteq P$
such that $p \in X$ and $q \in Y$. Finally, one can verify that if $D \subseteq P \times Q$, then
$\sigma(\mathrm{mdf}_D(G)) = g(\sigma(G))$ where $g$ is the composition of the operations $\mathrm{ren}_{X \to Y}$,
where $X \subseteq P$, $Y \subseteq Q$ and $Y = D(X) = \{q \in Q \mid (p, q) \in D \text{ for some } p \in X\}$.

It is a routine task to derive from these observations the fact that $\sim$ is a
VR-congruence. We now need to verify that $\sim$ saturates $L$. Let $G \in L$ and
$G \sim H$. In particular, $G \in \mathcal{GP}^\pi$, so that the non-void port labels of $\sigma(G)$ are
exactly the sets $\{p\}$ where $p$ is a non-void port label of $G$. Since $\sigma(G)$ and $\sigma(H)$
have the same non-void port labels, $H$ is also in $\mathcal{GP}^\pi$. Moreover, if $h$ is the
composition of the operations $\mathrm{ren}_{\{p\} \to p}$ ($p$ non-void in $G$), then $G = h(\sigma(G))$
and $H = h(\sigma(H))$. Since $h$ is VR$^\pi$-derived, it follows that $G \equiv H$, and hence
$H \in L$. This concludes the proof. □

5 The algebra of graphs with sources 

Recall that we call graphs with sources the elements of $\mathcal{StS}$ of sort $(E, C)$, where
$E = \{\mathrm{edge}\}$ and $C$ is some finite set of source labels, and that we write $\mathcal{GS}(C)$
for $\mathcal{StS}(E, C)$ (see Section 10).

5.1 The signature HR

The disjoint union and the operations of the form $\mathrm{srcren}_{a \to b}$, $\mathrm{srcfg}_a$ and $\mathrm{fus}_{a,b}$
(defined in Example 3.4) preserve graphs with sources. We denote by HR the
signature consisting of all these operations, so $\mathcal{GS}$ is an HR-algebra.
We note the following properties of HR-recognizability.

Proposition 5.1 Let $C$ be a finite set of source labels. Every $\mathcal{S}$-recognizable
subset of $\mathcal{StS}(E, C)$ is HR-recognizable.

Proof. This is a simple consequence of Proposition 2.1 and of the observation
given above that the operations of HR are also operations of $\mathcal{S}$. □

Note that the class Graph of graphs, defined in Section Em is equal to $\mathcal{GP}(\emptyset)$
as well as to $\mathcal{GS}(\emptyset) = \mathcal{StS}(E)$. Thus VR-recognizability and HR-recognizability
are properties of subsets of Graph.

Corollary 5.2 Let $L$ be a set of graphs (a subset of Graph). If $L$ is
VR-recognizable, then it is HR-recognizable.

Proof. This follows immediately from Proposition 5.1 and Theorem 4.5. □

Remark 5.3 Intuitively, the VR-operations are more powerful than the
HR-operations (every HR-context-free set of simple graphs is VR-context-free but
the converse is not true, Courcelle [0|), but the HR-operations are not among
the VR-operations, nor are they derived from them. □

We will see in Sections 6.1 and 6.2 sufficient conditions for HR-recognizable
sets to be VR-recognizable, and in Section 6.3 examples of HR-recognizable sets
which are not VR-recognizable.

5.2 Variants of the algebra of graphs with sources 

We find in the literature a number of variants of the signature HR or of the 
algebra QS. We now discuss these different variants, to verify that they do not 
introduce artefacts from the point of view of recognizability. 

5.2.1 The signature HR$_\|$

Let HR$_\|$ denote the signature on $\mathcal{GS}$ obtained by substituting the parallel
composition $\|$ for $\oplus$ (see Section 3.5.1). With the same proof as Proposition 3.15
we get the following result.

Proposition 5.4 Let $L$ be a subset of $\mathcal{GS}$. Then $L$ is HR-recognizable if and
only if it is HR$_\|$-recognizable.

5.2.2 Source-separated graphs

As in Section 3.5.2 we now discuss the class $\mathcal{GS}^{\mathrm{sep}}$ of source-separated graphs.
The operations of HR all preserve source separation, except for $\mathrm{fus}_{a,b}$, but we
defined in Example 3.16 the operation $\mathrm{fus}_{a \to b} = \mathrm{srcfg}_a \circ \mathrm{fus}_{a,b}$ which does. Let
HR$^{\mathrm{sep}}$ be the signature on $\mathcal{GS}^{\mathrm{sep}}$ consisting of $\oplus$ and the qfd unary operations of
the form $\mathrm{srcren}_{a \to b}$, $\mathrm{srcfg}_a$ and $\mathrm{fus}_{a \to b}$.
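The difference between $\mathrm{fus}_{a,b}$ and $\mathrm{fus}_{a \to b}$ can be seen on a toy model. The Python sketch below assumes that a graph with sources is stored with a dictionary `src` mapping each source label to its vertex, that $\mathrm{fus}_{a,b}$ merges the two designated vertices while keeping both labels, and that source separation means that distinct labels designate distinct vertices. These readings are inferred from the surrounding text (the formal definitions are in Examples 3.4 and 3.16, which are not reproduced here), and every function name is hypothetical.

```python
def srcfg(g, a):
    """srcfg_a: forget the source label a (the vertex itself stays)."""
    src = {c: v for c, v in g["src"].items() if c != a}
    return {"V": set(g["V"]), "E": set(g["E"]), "src": src}


def fus(g, a, b):
    """fus_{a,b}: merge the a-source into the b-source; both labels remain."""
    va, vb = g["src"][a], g["src"][b]

    def merge(v):
        return vb if v == va else v

    return {
        "V": {merge(v) for v in g["V"]},
        "E": {(merge(u), merge(v)) for u, v in g["E"]},
        "src": {c: merge(v) for c, v in g["src"].items()},
    }


def fus_to(g, a, b):
    """fus_{a->b} = srcfg_a o fus_{a,b}."""
    return srcfg(fus(g, a, b), a)


def source_separated(g):
    """True when no vertex carries two source labels."""
    return len(set(g["src"].values())) == len(g["src"])


# Hypothetical example: sources a and b on two distinct vertices.
example = {"V": {1, 2, 3}, "E": {(1, 3), (2, 3)}, "src": {"a": 1, "b": 2}}
fused = fus(example, "a", "b")
fused_and_forgotten = fus_to(example, "a", "b")
```

Here `fused` has one vertex carrying both labels, so it is not source-separated, whereas `fused_and_forgotten` is.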

Proposition 5.5 Let $L$ be a subset of $\mathcal{GS}^{\mathrm{sep}}$. Then $L$ is HR-recognizable if and
only if it is HR$^{\mathrm{sep}}$-recognizable.

Proof. Since HR$^{\mathrm{sep}}$ consists only of HR-derived operations, every
HR-recognizable subset of $\mathcal{GS}^{\mathrm{sep}}$ is also HR$^{\mathrm{sep}}$-recognizable.

The proof of the converse is a variant of the proof of Theorem 3.18. First we
note that the type relation $\zeta$ (see Section 3.3) is also an HR-congruence on $\mathcal{GS}$.
We use the same mapping $h$ defined in the proof of Theorem 3.18 that maps a
graph with sources $S \in \mathcal{GS}(C)$ to a source-separated graph $h(S) \in \mathcal{GS}^{\mathrm{sep}}(C)$ by
splitting sources that were identified in $S$. We refer to that proof for notation
used here.

If $L$ is an HR$^{\mathrm{sep}}$-recognizable subset of $\mathcal{GS}^{\mathrm{sep}}$ and $\equiv$ is a locally finite
HR$^{\mathrm{sep}}$-congruence recognizing it, we define a relation $\sim$ on $\mathcal{GS}$ as follows. If $S, T \in
\mathcal{GS}(C)$, we say that $S \sim T$ if $\zeta(S) = \zeta(T)$ and $h(S) \equiv h(T)$. As in the proof of
Theorem 3.18, $\sim$ is easily seen to be a locally finite equivalence relation. It is
also easily seen that $\sim$ is preserved under the HR$^{\mathrm{sep}}$-operation $\oplus$.

We now need to verify that if $S \sim T \in \mathcal{GS}(C)$ and $g$ is one of the unary
operations of HR$^{\mathrm{sep}}$ defined on $\mathcal{GS}(C)$, then $g(S) \sim g(T)$. Again,
Proposition 3.9 shows that $\zeta(g(S)) = \zeta(g(T))$ and we want to show that $h(g(S)) \equiv
h(g(T))$. The graphs $S$ and $T$ are fixed for the rest of this proof. We write $h_0$,
$C_0$ and $C_1$ for $h_0^S$, $C_0^S$ and $C_1^S$.

As in the proof of Theorem 3.18, it suffices to construct an HR$^{\mathrm{sep}}$-derived
operation $k$, depending on $g$ and $\zeta(S)$, such that $h(g(S)) = k(h(S))$ and $h(g(T)) =
k(h(T))$. There is no reason why the operation $k$ constructed in the proof of
Theorem 3.18 should be HR$^{\mathrm{sep}}$-derived, but the operations $g$ considered here,
namely $\mathrm{srcren}_{a \to b}$, $\mathrm{srcfg}_a$ and $\mathrm{fus}_{a \to b}$, are simple enough that we can directly
construct a suitable $k$ in each case.

If $g = \mathrm{srcren}_{a \to b}$, then $g$ is defined on $\mathcal{GS}(C)$ (where $a \in C$ and $b \notin C$) and
its range is $\mathcal{GS}(C \setminus \{a\} \cup \{b\})$. One verifies that $h(\mathrm{srcren}_{a \to b}(S))$ is equal to:

• $\mathrm{srcren}_{a \to b}(h(S))$ if $a \in C_1$ and $b > h_0(a)$;

• $\mathrm{srcren}_{a \to h_0(a)}(\mathrm{srcren}_{h_0(a) \to b}(h(S)))$ if $a \in C_1$ and $b < h_0(a)$;

• $\mathrm{srcren}_{a \to b}(h(S))$ if $a \in C_0$ and $b < c$ for every $c \in C_1$ such that $h_0(c) = a$;

• srcren c ^b(srcren;,_ >c (ft.(S'))) if a G Co and b > c = mm{d G C\ \ ho(d) = a. 

If $g = \mathrm{srcfg}_a$, then $g$ is defined on $\mathcal{GS}(C)$ (where $a \in C$) and its range is
$\mathcal{GS}(C \setminus \{a\})$. One verifies that $h(\mathrm{srcfg}_a(S))$ is equal to:

• $\mathrm{fus}_{a \to h_0(a)}(h(S))$ if $a \in C_1$;

• $\mathrm{fus}_{a \to c}(h(S))$ if $a \in C_0$, $h_0^{-1}(a) \ne \emptyset$ and $c = \min h_0^{-1}(a)$;

• $\mathrm{srcfg}_a(h(S))$ if $a \in C_0$ and $h_0^{-1}(a) = \emptyset$.

If $g = \mathrm{fus}_{a \to b}$, then $g$ is defined on $\mathcal{GS}(C)$ (where $a \ne b \in C$) and its range is
$\mathcal{GS}(C \setminus \{a\})$. One verifies that $h(\mathrm{fus}_{a \to b}(S))$ is equal to:

• $\mathrm{srcren}_{a \to h_0(a)}(\mathrm{fus}_{h_0(a) \to h_0(b)}(h(S)))$ if $a \in C_1$ and $h_0(b) < h_0(a)$;

• $\mathrm{srcren}_{a \to h_0(b)}(\mathrm{fus}_{h_0(b) \to h_0(a)}(h(S)))$ if $a \in C_1$ and $h_0(b) > h_0(a)$;

• srcren a ^ ho(o )(/i(5)) if a G Ci and /i (k) = h (a); 

• $\mathrm{fus}_{a \to h_0(b)}(h(S))$ if $a \in C_0$ and $a > h_0(b)$;

• srcren a ^ c (fus/ l0 ( b )^ a (/i(S'))) if a G Co, and c = min{/i (&), /i ~ 1 (a)}, and 
a < h (b); 

• srcren a ^ c (fus a _ >c (/i(5'))) if a G Co, a = h n (b) and c = min{d G C\ 
h (d) = a}. 

This concludes the proof. □ 

Again with the same proof as for Proposition 3.15 we can show that the
operation $\oplus$ can be replaced by $\|$ in the signature HR$^{\mathrm{sep}}$, yielding the signature
HR$^{\mathrm{sep}}_\|$.

Proposition 5.6 Let $L$ be a subset of $\mathcal{GS}^{\mathrm{sep}}$. Then $L$ is HR$^{\mathrm{sep}}$-recognizable if
and only if it is HR$^{\mathrm{sep}}_\|$-recognizable.

5.2.3 Other variants 

The equivalence between HR$^{\mathrm{sep}}_\|$- and HR$_\|$-recognizability for a set of
source-separated graphs (a consequence of Propositions 5.4, 5.5 and 5.6) was already
established by Courcelle in ^U] for graphs with multi-edges (see Section 0). In
the same paper, Courcelle established the equivalence between HR$^{\mathrm{sep}}$- and
$B$-recognizability for several variants $B$ of the signature HR, which we now describe.
We refer to ^U] for the proofs.

For each finite set $C$ of source labels, let $\mathrm{srcfg}_{\mathrm{all}}$ be the composition of the
operations $\mathrm{srcfg}_c$ for each $c \in C$ (in any order). Let also $\square_C$ be the following
binary operation on $\mathcal{GS}^{\mathrm{sep}}$, of type $(C, C) \to \emptyset$: if $G, H \in \mathcal{GS}^{\mathrm{sep}}(C)$, then
$G \mathbin{\square_C} H = \mathrm{srcfg}_{\mathrm{all}}(G \| H)$: $G \mathbin{\square_C} H$ is obtained by first taking the parallel
composition $G \| H$, and then forgetting all source labels.

Let CS be the signature on $\mathcal{GS}^{\mathrm{sep}}$ which consists only of the $\square_C$ operations.

Let HR$^{\mathrm{fg}}$ be the derived signature of HR$_\|$ which consists of the operations
$\mathrm{srcfg}_{\mathrm{all}}$ and $\|$.

Let HR$^{\mathrm{ren}}$ be the subsignature of HR$_\|$ which consists of the operations
$\mathrm{srcren}_{p \to q}$ and $\|$.

Let HR$^{\mathrm{sep}}_{\mathrm{ren}}$ be the subsignature of HR$^{\mathrm{sep}}_\|$ which consists of the operations $\|$
and those operations $\mathrm{srcren}_{p \to q}$ which preserve source separation.
The following result is a compilation of results from Section 4 of that paper.

Proposition 5.7 If $L \subseteq \mathcal{GS}$, then $L$ is HR-recognizable if and only if $L$ is
HR$^{\mathrm{ren}}$-recognizable.

If $L \subseteq \mathcal{GS}^{\mathrm{sep}}$, then $L$ is HR$^{\mathrm{sep}}$-recognizable if and only if $L$ is
HR$^{\mathrm{sep}}_{\mathrm{ren}}$-recognizable.

If $L \subseteq$ Graph, the following are equivalent:

• $L$ is HR-recognizable;

• $L$ is CS-recognizable;

• $L$ is HR$^{\mathrm{fg}}$-recognizable.

Remark 5.8 The notation CS refers to the notion of fully cutset-regular sets
of graphs, introduced by Abrahamson and Fellows [1]. Full cutset-regularity is
equivalent to CS-recognizability. □



Courcelle also shows a number of closure properties of the class of
HR$^{\mathrm{sep}}$-recognizable sets of source-separated graphs with sources. In particular,
it is shown that this class contains all singletons and it is closed under the
operations of HR$^{\mathrm{sep}}$ ^3 Section 6].

Finally Courcelle shows the following result |1U1 Theorem 6.7].

Proposition 5.9 Let $L \subseteq \mathcal{GS}(C)$. Then $L$ is HR-recognizable if and only if
$\mathrm{srcfg}_{\mathrm{all}}(L)$ is HR-recognizable.

6 Finiteness conditions ensuring that HR- and 
VR-recognizability coincide 

We saw that a VR-recognizable set of graphs is always HR-recognizable
(Corollary 5.2). The converse does not hold in general, as we discuss in Section 6.3. We
first explore structural conditions on graphs, which are sufficient to guarantee
that an HR-recognizable set of graphs is also VR-recognizable.

Let $\vec{K}_{n,n}$ be the directed complete bipartite graph with $n + n$ vertices. A
directed graph $G \in$ Graph is without $\vec{K}_{n,n}$ if it has no subgraph isomorphic to
$\vec{K}_{n,n}$. The main result in this section is the following.

Theorem 6.1 Let $n$ be an integer. An HR-recognizable set of graphs without
$\vec{K}_{n,n}$ is VR-recognizable.

This theorem is proved in Section 6.1 and some of its corollaries are discussed
in Section 6.2.

Note that results similar to Corollary 5.2 and Theorem 6.1 hold for VR- and
HR-equational sets of graphs. As explained in the introduction, such sets are
exactly the context-free sets of graphs, formally specified in terms of recursive
sets of equations using the operations of VR and HR respectively. Specifically,
the following results are known to hold:

• every HR-equational set of simple directed graphs is VR-equational
(Courcelle);

• if a VR-equational set of directed graphs is without $\vec{K}_{n,n}$ for some $n$, then
it is HR-equational (by the main theorem in Courcelle Jl] and Lemma KH)!
below).

Thus the same combinatorial condition is sufficient to guarantee the
equivalence between VR- and HR-recognizability, as well as between VR- and
HR-equationality. A further similar result concerning monadic second-order
definability and using a stronger combinatorial property will be discussed in
Section EH

6.1 Proof of Theorem 6.1

We first record the following observation. 

Lemma 6.2 Let $G$ be a directed graph and let $x$, $y$ be two vertices of $G$ that are
not adjacent, and such that there is no vertex $z$ such that both $(x, z)$ and $(y, z)$
(resp. both $(z, x)$ and $(z, y)$) are edges. Let $H$ be obtained from $G$ by identifying
$x$ and $y$. If $G$ contains $\vec{K}_{m,m}$ as a subgraph, then so does $H$.

Proof. Let $K$ be a subgraph of $G$ isomorphic to $\vec{K}_{m,m}$. From the hypothesis,
the vertices $x$ and $y$ are not both in $K$. It follows that $K$ is still isomorphic to
a subgraph of $H$. □
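The lemma can be checked on small instances with a brute-force search. The Python sketch below is only an illustration: it assumes that $\vec{K}_{k,k}$ has all its edges directed from one side to the other, represents a directed graph by a vertex set and a set of ordered pairs, and uses hypothetical function names throughout.

```python
from itertools import combinations


def has_biclique(vertices, edges, k):
    """Brute force: are there disjoint k-sets A, B with every edge A -> B present?"""
    ordered = sorted(vertices)
    for left in combinations(ordered, k):
        rest = [v for v in ordered if v not in left]
        for right in combinations(rest, k):
            if all((a, b) in edges for a in left for b in right):
                return True
    return False


def lemma_hypothesis(vertices, edges, x, y):
    """x and y are non-adjacent, with no common successor and no common predecessor."""
    if (x, y) in edges or (y, x) in edges:
        return False
    return not any(
        ((x, z) in edges and (y, z) in edges) or ((z, x) in edges and (z, y) in edges)
        for z in vertices
    )


def identify(vertices, edges, x, y):
    """Merge y into x, redirecting every edge that touched y."""
    def merge(v):
        return x if v == y else v

    return {merge(v) for v in vertices}, {(merge(u), merge(v)) for u, v in edges}


# Hypothetical example: a biclique on {1, 2} -> {3, 4} plus two pendant vertices.
V = {1, 2, 3, 4, 5, 6}
E = {(1, 3), (1, 4), (2, 3), (2, 4), (5, 1), (4, 6)}
V2, E2 = identify(V, E, 5, 6)
```

Vertices 5 and 6 satisfy the hypothesis, and the biclique survives their identification. Vertices 1 and 2 share the successor 3, so the hypothesis fails for them, in line with the case analysis of the proof.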

The proof of Theorem 6.1 will proceed as follows. We consider an
HR-recognizable set $L$ of finite graphs without $\vec{K}_{n,n}$ and we denote by $m$ the largest
integer such that $\vec{K}_{m,m}$ is a subgraph of a graph in $L$. Such an integer exists
by hypothesis.

Since we are talking about source-less graphs, the set $L$ is HR$^{\mathrm{sep}}$-recognizable
by Proposition 5.5 and we consider a locally finite HR$^{\mathrm{sep}}$-congruence $\equiv$
saturating $L$. We will define a locally finite NLC-congruence $\sim$ on $\mathcal{GP}$ that also
saturates $L$. By Proposition 4.9 this suffices to show that $L$ is VR-recognizable.
The definition of $\sim$ makes use of the notion of expansion of a graph, defined
below.

Note that the following definitions depend on the integer $m$, even though
terminology and notation do not make this dependence explicit.

Small and large port labels and formulas. Let $G \in \mathcal{GP}(P)$ be a graph with
ports. If $p \in P$, we denote by $p_G$ the set of $p$-ports of $G$. We say that a port label
$p$ is void in $G$ if $p_G$ is empty, we say that $p$ is small in $G$ if $1 \le \mathrm{card}(p_G) \le m$
and that $p$ is large in $G$ if $\mathrm{card}(p_G) > m$.

Observe that if the port labels $p$ and $q$ are both large in $G$, then $\mathrm{add}_{p,q}(G)$
contains $\vec{K}_{m+1,m+1}$ as a subgraph.
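The classification and the observation above can be tried out on a toy instance. In the Python sketch below, ports are stored as (vertex, label) pairs, the sets of $p$-ports and $q$-ports of the example are disjoint, and $\vec{K}_{k,k}$ is taken to have all edges directed from one side to the other. These choices, and all function names, are assumptions made for illustration.

```python
from itertools import combinations


def classify(ports, labels, m):
    """Map each label to 'void', 'small' or 'large' according to its number of ports."""
    kinds = {}
    for p in labels:
        count = len({v for v, label in ports if label == p})
        kinds[p] = "void" if count == 0 else "small" if count <= m else "large"
    return kinds


def add(edges, ports, p, q):
    """add_{p,q} on the edge set: an edge from each p-port to each q-port."""
    p_ports = {v for v, label in ports if label == p}
    q_ports = {v for v, label in ports if label == q}
    return edges | {(u, v) for u in p_ports for v in q_ports if u != v}


def has_biclique(vertices, edges, k):
    """Brute force: are there disjoint k-sets A, B with every edge A -> B present?"""
    ordered = sorted(vertices)
    for left in combinations(ordered, k):
        rest = [v for v in ordered if v not in left]
        for right in combinations(rest, k):
            if all((a, b) in edges for a in left for b in right):
                return True
    return False


# Hypothetical example with m = 1: two p-ports, two q-ports, one r-port, no s-port.
m = 1
vertices = {1, 2, 3, 4, 5}
ports = {(1, "p"), (2, "p"), (3, "q"), (4, "q"), (5, "r")}
kinds = classify(ports, ["p", "q", "r", "s"], m)
after = add(set(), ports, "p", "q")
```

With `m = 1` both `p` and `q` are large, and `after` contains a copy of the biclique with $m + 1 = 2$ vertices on each side, while the edgeless starting graph does not.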

Moreover, if $p$ is large in $G$, if $r_1, \ldots, r_k$ are small in $G$, let

    $H = \mathrm{add}_{p,r_1} \mathrm{add}_{p,r_2} \cdots \mathrm{add}_{p,r_k}(G)$.

For $i = 1, \ldots, k$, let $n_i = \mathrm{card}((r_i)_G)$. If $H$ does not contain $\vec{K}_{m+1,m+1}$, then we
must have $n_1 + \cdots + n_k \le m$. If $G$ already contains edges from the $p$-ports to
other vertices, then $n_1 + \cdots + n_k < m$. The notion of expansion below will make
it possible to handle this sort of complicated situation (see Example 6.3 below).

Let us say that a closed first-order formula is small if it has quantifier-depth
at most $2m + 2$. Note that the existence of a subgraph isomorphic to $\vec{K}_{m+1,m+1}$
can be expressed by a first-order formula of quantifier-depth $2m + 2$.
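One possible such formula is written out below, as a sketch. It assumes that $\vec{K}_{m+1,m+1}$ has its edges directed from one side $x_1, \ldots, x_{m+1}$ to the other side $y_1, \ldots, y_{m+1}$ and that the $2m + 2$ vertices are pairwise distinct; the variable names are chosen here for illustration.

```latex
\exists x_1 \cdots \exists x_{m+1}\; \exists y_1 \cdots \exists y_{m+1}\;
\Bigl(
  \bigwedge_{1 \le i < j \le m+1} (x_i \ne x_j \wedge y_i \ne y_j)
  \;\wedge\;
  \bigwedge_{1 \le i, j \le m+1} \bigl( x_i \ne y_j \wedge \mathrm{edge}(x_i, y_j) \bigr)
\Bigr)
```

The matrix is quantifier-free and the prefix nests $2(m+1)$ existential quantifiers, which gives quantifier-depth $2m + 2$.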

Expansions. We will define supergraphs of $G \in \mathcal{GP}(P)$ called expansions, that
contain information relevant to the distribution of small and large port labels,
and where ports are represented by sources. Furthermore, it will be possible
to simulate an NLC-operation on $G$ that does not create $\vec{K}_{m+1,m+1}$ subgraphs
by HR-operations on expansions of $G$. These expansions will then be used to
transform the HR$^{\mathrm{sep}}$-congruence $\equiv$ into an NLC-congruence $\sim$.

Furthermore, we will define $\sim$ in such a way that two equivalent graphs
satisfy the same small first-order formulas.

We now give formal definitions. For each port label $p$, we define a set $C(p)$
of source labels,

    $C(p) = \{\mathrm{in}(p, i), \mathrm{out}(p, i), s(p, i) \mid 1 \le i \le m\}$.

If $P$ is a set of port labels, $C(P)$ denotes the union of the $C(p)$, for $p$ in $P$.

Let $G \in \mathcal{GP}(P)$ be a graph with ports, let $C \subseteq C(P)$, and let $\bar{G}$ be a graph
in $\mathcal{GS}^{\mathrm{sep}}(C)$. We say that $\bar{G}$ is an expansion of $G$ if the following conditions
hold:

(1) $\bar{G}$ has no subgraph isomorphic to $\vec{K}_{m+1,m+1}$.

(2) Except for the labeling of ports and sources, $G$ is a subgraph of $\bar{G}$. The
sources of $\bar{G}$, and its vertices and edges not in $G$, are specified by
Conditions (3) and (4).

(3) If $p$ is small in $G$, then each $p$-port of $G$ is an $s(p, i)$-source of $\bar{G}$ for some
integer $i \le m$. Different $p$-ports are of course labelled by different source
labels. There are no $\mathrm{in}(p, j)$- or $\mathrm{out}(p, j)$-sources.

(4) If $p$ is large in $G$, then there may be vertices of $\bar{G}$ that are not in $G$, with
source labels of the form $\mathrm{in}(p, i)$ or $\mathrm{out}(p, i)$ for some $i \le m$. Moreover,
there is an edge in $\bar{G}$ from each vertex of $p_G$ to each $\mathrm{in}(p, i)$-source, and
from each $\mathrm{out}(p, i)$-source to each vertex in $p_G$. There are no $s(p, j)$-sources.

In particular, $G$ may have several different expansions, but it has only a
finite number of expansions (up to isomorphism). This number is bounded by a
function depending on $m$ and the cardinality of $P$. Indeed, for each small port
label $p$, there is only a bounded number of ways to make $p$-ports into
$s(p, i)$-sources (see (3)), and for each large port label $p$, there is a bounded number of
ways to create $\mathrm{in}(p, i)$- and $\mathrm{out}(p, i)$-sources (see (4)).

Example 6.3 Let $m = 2$, and let $G$ be a graph with port labels $p$, $q$, $r$. Suppose
that $G$ has 4 $p$-ports, 2 $q$-ports and 1 $r$-port, so that $p$ is large, and $q$, $r$ are
small in $G$, see Figure 1. Then in any expansion of $G$, every $q$- and $r$-port
will be a source, say labeled by $s(q, 1)$, $s(q, 2)$ and $s(r, 2)$ (there is only one
$s(r, i)$-source, but it is not required that these sources should be labeled with
consecutive numbers starting at 1).

[Figure 1: $H$ is an expansion of $G$. Source labels shown: out(p, 2), in(p, 1), in(p, 2), s(q, 1), s(q, 2), s(r, 2).]

Moreover, an expansion of $G$ may have up to two new vertices that are
$\mathrm{in}(p, j)$-sources, and at most one $\mathrm{out}(p, j)$-source. Say, an expansion $H$ could
have new vertices as $\mathrm{in}(p, 1)$- and $\mathrm{in}(p, 2)$-sources, with edges from each of the
4 $p$-ports to each $\mathrm{in}(p, j)$-source; and it could have a new vertex as a, say,
$\mathrm{out}(p, 2)$-source, with edges from that vertex to each of the $p$-ports.

Note that if $G$ has a vertex $x$ with an edge from $x$ to at least 3 $p$-ports,
then an expansion cannot have 2 $\mathrm{out}(p, j)$-sources: otherwise it would contain
a copy of $\vec{K}_{3,3}$, which is not allowed for an expansion. □

Remark 6.4 It is not always the case that $G$ is determined by each of its
expansions $\bar{G}$. If $p$ is large in $G$ but $\bar{G}$ has no $\mathrm{in}(p, i)$- or $\mathrm{out}(p, i)$-sources, then
it is not possible to determine which of its vertices are $p$-ports. □

Construction of an NLC-congruence from an HR$^{\mathrm{sep}}$-congruence. Let $\equiv$
be a locally finite HR$^{\mathrm{sep}}$-congruence saturating $L$. We define a relation $\sim$ on $\mathcal{GP}$
as follows. For $G$ and $G'$ in $\mathcal{GP}(P)$ we let $G \sim G'$ if and only if

(a) either $G$ and $G'$ both contain $\vec{K}_{m+1,m+1}$ as a subgraph, or neither does
and in that case, the following two conditions hold:

(b) $G$ and $G'$ satisfy the same small first-order formulas (i.e., with
quantifier-depth at most $2m + 2$) on graphs with ports.

(c) for every expansion $\bar{G}$ of $G$, there exists an expansion $\bar{G}'$ of $G'$ such that
$\bar{G} \equiv \bar{G}'$ and $\bar{G}$ and $\bar{G}'$ satisfy the same small first-order formulas on
graphs with sources (we say that $\bar{G}$ and $\bar{G}'$ are equivalent expansions);
and conversely, for every expansion $\bar{G}'$ of $G'$ there exists an expansion $\bar{G}$
of $G$ equivalent to $\bar{G}'$.

Note that Condition (b) implies that $G$ and $G'$ have the same void, small
and large port labels, and Condition (c) implies that $\bar{G}$ and $\bar{G}'$ have the same
source labels.

The relation $\sim$ is clearly an equivalence relation on each set $\mathcal{GP}(P)$. It has
finitely many classes on each $\mathcal{GP}(P)$ since a finite graph has a uniformly bounded
number of expansions (up to isomorphism), the HR$^{\mathrm{sep}}$-congruence $\equiv$ is locally
finite, and there are finitely many first-order formulas of each quantifier-depth
on graphs with sources in a subset of $C(P)$.

Now a graph without ports and without $\vec{K}_{m+1,m+1}$ has a unique expansion:
itself. It follows that, for graphs without ports and without $\vec{K}_{m+1,m+1}$, the
equivalences $\equiv$ and $\sim$ coincide. In particular, $\sim$ saturates $L$ since $\equiv$ does.

It remains to prove that $\sim$ is an NLC-congruence. Recall that the signature
NLC consists of the operations of the form $\mathrm{fg}_p$, $\mathrm{ren}_{p \to q}$ and $\otimes_J$.

The port forgetting operation. We first consider the operation $\mathrm{fg}_p$. We
consider $G$, $G'$ with $G \sim G'$ and we want to prove that $H \sim H'$, where $H =
\mathrm{fg}_p(G)$ and $H' = \mathrm{fg}_p(G')$.

First of all, the underlying graphs of $G$ and $H$ (resp. $G'$ and $H'$) are identical,
so that $G$ and $G'$ contain $\vec{K}_{m+1,m+1}$ if and only if so do $H$ and $H'$. If this is
the case, then $G \sim G'$ and $H \sim H'$. We now exclude this case and assume that
$G$ and $G'$ are without $\vec{K}_{m+1,m+1}$. Note also that if $p$ is void in $G$, then it is in
$G'$ as well, and we have $H = G$, $H' = G'$, so that $H \sim H'$. We now assume
that $p$ is not void in $G$.

It is an immediate consequence of Theorem 3.12 that $H$ and $H'$ satisfy the
same small first-order formulas on graphs with ports, so Condition (b) is verified.

We now consider Condition (c) . Let H be an expansion of H. We will show 
that there exists an expansion G of G and a unary HR sep -term t such that H = 
t(G). Since G ~ G', there exists an equivalent expansion G' of G', and t(G') will 
be the desired expansion of H'. Using the fact that = is an HR sep -congruence 
and Theorem 13. 121 we will have H ~ H' as expected. 
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If p is large in G, the situation is particularly simple: H is also an expansion 
of G, so we can choose t to represent the identity. If G' is an expansion of G' , 
equivalent to H, then G' does not use source labels of the form s(p, i), in(p, i) 
or out(p, i), so G' is also an expansion of H' . 

If p is small in G, let G be a graph with source obtained from H by letting 
each p-port of G be an s(p, i)-source (where distinct source labels are used for 
distinct p-ports). Then G is an expansion of G, and H = t(G) where t is the 
composition of the operations srcfg s / p i % (1 < i < in). Using the definition of ~, 
there exists an expansion G' of G' which is equivalent to G, and we only need 
to verify that H' = i(G') is an expansion of H'. The only point to check here 
is the fact that H' is a subgraph of H': this follows from the facts that G is a 
subgraph of G and the operations t and fg p do not change the underlying graph 
structures. 

The renaming operation. We now consider the operation $\mathrm{ren}_{p \to q}$. Let G, G'
with G ~ G'; as with the port forgetting operation, we want to prove that
H ~ H' where H = $\mathrm{ren}_{p \to q}(G)$ and H' = $\mathrm{ren}_{p \to q}(G')$. As above, we can reduce
the proof to the case where neither G nor G' contains $\vec{K}_{m+1,m+1}$, and where
p is not void in G (if p is void in G, then H = G and H' = G'). Moreover,
Condition (b) follows from Theorem 3.12.

We consider Condition (c), following the same strategy as above. Let $\bar{H}$ be
an expansion of H.

If q is void in G, then the transformation $\mathrm{ren}_{p \to q}$ is a reversible renaming,
that is, G = $\mathrm{ren}_{q \to p}(H)$. Moreover, if t is the composition of the operations
of the form $\mathrm{srcren}_{s(p,i) \to s(q,i)}$, $\mathrm{srcren}_{in(p,i) \to in(q,i)}$ and $\mathrm{srcren}_{out(p,i) \to out(q,i)}$, and
if t' is the composition of the operations $\mathrm{srcren}_{s(q,i) \to s(p,i)}$, $\mathrm{srcren}_{in(q,i) \to in(p,i)}$
and $\mathrm{srcren}_{out(q,i) \to out(p,i)}$, then $\bar{G} = t'(\bar{H})$ is an expansion of G and $\bar{H} = t(\bar{G})$.
Moreover, if $\bar{G}'$ is an expansion of G', equivalent to $\bar{G}$, then $\bar{H}' = t(\bar{G}')$ is an
expansion of H'.

We now assume that q is not void in G. We need to consider several cases. 

Case 1. p and q are both large in G. Then p is void and q is large in H.

In order to build the desired $\bar{G}$, we split each in(q, i)-source of $\bar{H}$ into an
in(p, i)-source and an in(q, i)-source. The in(p, i)-source is linked by incoming
edges to all p-ports of G, and the in(q, i)-source is linked similarly to all q-ports.
In the same fashion, we split each out(q, i)-source of $\bar{H}$ into an out(p, i)-source
and an out(q, i)-source linked by outgoing edges to all p-ports of G and to all
q-ports respectively. The term t such that $\bar{H} = t(\bar{G})$ is the composition of the
operations $\mathrm{fus}_{in(p,i),in(q,i)}$ and $\mathrm{fus}_{out(p,i),out(q,i)}$.

The graph $\bar{G}$ does not contain $\vec{K}_{m+1,m+1}$ since $\bar{H}$ does not (by Lemma ...).
Hence $\bar{G}$ is an expansion of G. Let now $\bar{G}'$ be an expansion of G' equivalent to
$\bar{G}$, and let $\bar{H}' = t(\bar{G}')$. It is easily verified that $\bar{H}'$ is an expansion of H', and
as above, it follows that H ~ H'.

Case 2. p is small and q is large in G.

In order to build $\bar{G}$ from $\bar{H}$, we make the p-ports of G into s(p, i)-sources, and we
delete the edges between the in(q, i)- and the out(q, i)-sources and the p-ports
of G. The term t which must do the opposite (that is, construct $\bar{H}$ from $\bar{G}$) is a
composition of source forgetting operations and of additions of new edges. More
precisely, for each i, j such that s(p, i) and in(q, j) are source labels in $\bar{G}$, we use
the operation $Z \mapsto Z \oplus (\alpha \to \omega)$, where $(\alpha \to \omega)$ is the 2-vertex, 2-source,
1-edge graph, followed by the operations $\mathrm{fus}_{\alpha,s(p,i)}$ and $\mathrm{fus}_{\omega,in(q,j)}$. We then
apply similar operations to create edges from the out(q, j)- to the s(p, i)-sources.
And we finally apply the operations $\mathrm{srcfg}_{s(p,i)}$.

The graph $\bar{G}$ is a subgraph of $\bar{H}$ (up to source labels), so $\bar{G}$ does not contain
$\vec{K}_{m+1,m+1}$, and hence it is an expansion of G. The proof continues as in the
previous case.

Case 3. q is small and p is large in G.

To build $\bar{G}$ from $\bar{H}$, we make the q-ports of G into s(q, i)-sources, and we delete
the edges between the in(p, i)-sources or the out(p, i)-sources and the q-ports
of G. In addition we rename each in(p, i)-source to an in(q, i)-source, and each
out(p, i)-source to an out(q, i)-source. We can use the same reasoning as in Case
2 to conclude in this case.

Case 4. p and q are small in G, and $\mathrm{card}(p_G) + \mathrm{card}(q_G) \le m$.

To build $\bar{G}$ from $\bar{H}$, we rename s(q, i) into s(p, i) whenever the s(q, i)-source
of $\bar{H}$ is a p-port in G. The term t which does the opposite is a composition
of source renamings. The graph $\bar{G}$ does not contain $\vec{K}_{m+1,m+1}$, otherwise $\bar{H}$
would too, since $\bar{G}$ is equal to $\bar{H}$ up to source labels, and hence $\bar{G}$ is an expansion
of G. The other parts of the proof are the same.

Case 5. p and q are small in G, and $\mathrm{card}(p_G) + \mathrm{card}(q_G) \ge m + 1$.

To build $\bar{G}$ from $\bar{H}$, we make the p-ports (resp. q-ports) of G into s(p, i)-sources
(resp. s(q, i)-sources), we delete the edges between the in(q, i)- and
out(q, i)-sources and the p- and q-ports of G, and we delete the in(q, i)- and
out(q, i)-sources. The term t which does the opposite is a composition of
additions of new edges and of srcfg operations, as in Case 2; see Figure 2. The graph
$\bar{G}$ does not contain $\vec{K}_{m+1,m+1}$, otherwise $\bar{H}$ would too, since $\bar{G}$ is a subgraph of
$\bar{H}$ (up to source labels), and hence $\bar{G}$ is an expansion of G. The proof continues
as in the previous cases.

This concludes the proof that G ~ G' implies $\mathrm{ren}_{p \to q}(G)$ ~ $\mathrm{ren}_{p \to q}(G')$.

The operation $\otimes_J$. We now consider the operation $\otimes_J$ where $J \subseteq (P \times Q) \cup (Q \times P)$
and P and Q are disjoint. Let G ~ G' in QV(P), K ~ K' in QV(Q),
H = $G \otimes_J K$ and H' = $G' \otimes_J K'$. We want to prove that H ~ H'.

We first consider the very special case where $J = \emptyset$, and the operation $\otimes_J$
is simply the disjoint union. Then H contains $\vec{K}_{m+1,m+1}$ if and only if G or K
does, if and only if G' or K' does, if and only if H' does.

Assuming that H does not contain $\vec{K}_{m+1,m+1}$, an application of Theorem 3.12
ensures, as for the operations of port forgetting or renaming, that H and H'
satisfy the same small first-order formulas.

We now consider an expansion $\bar{H}$ of H. It is necessarily of the form $\bar{H} =
\bar{G} \oplus \bar{K}$ where $\bar{G}$ and $\bar{K}$ are expansions of G and K respectively. Then there
exist expansions $\bar{G}'$ and $\bar{K}'$ of G' and K' respectively, which are equivalent to
$\bar{G}$ and $\bar{K}$. One then verifies that $\bar{H}' = \bar{G}' \oplus \bar{K}'$ is an expansion of H' which is
equivalent to $\bar{H}$.

Next we assume that J is a singleton, J = {(p, q)}, that is, $G \otimes_J K =
\mathrm{add}_{p,q}(G \oplus K)$ with $p \in P$ and $q \in Q$.

Since G and G' on one hand, and K and K' on the other satisfy the same
small first-order formulas, Theorem 3.12 shows that H = $\mathrm{add}_{p,q}(G \oplus K)$ contains
$\vec{K}_{m+1,m+1}$ if and only if H' = $\mathrm{add}_{p,q}(G' \oplus K')$ does. Assume now this is not
the case and consider an expansion $\bar{H}$ of H.

Again there are several cases. Note that p and q cannot both be large in
G and K respectively. We claim that $\bar{H}$ can be defined as $t(\bar{G}, \bar{K})$ where t is an
$HR^{sep}$-term, $\bar{G}$ is an expansion of G and $\bar{K}$ is an expansion of K. As for the
other operations, we consider expansions $\bar{G}'$ and $\bar{K}'$ of G' and K', equivalent to
$\bar{G}$ and $\bar{K}$. Although it is a bit tedious, we verify formally that $\bar{H}' = t(\bar{G}', \bar{K}')$
is an expansion of H'. It follows that $\bar{H}'$ is equivalent to $\bar{H}$, and hence H ~ H'.

Case 1. p is large in G and q is small in K.

Then $\bar{H}$ has edges from all p-ports of G to all q-ports of K, which are actually
s(q, i)-sources in $\bar{H}$. For each of these s(q, i)-sources, say x, we create a new
vertex x', and each edge coming from G to x is redirected towards x'. We make
x' into an in(p, j)-source (for some appropriate j) of the expansion $\bar{G}$ of G we are
constructing. The desired expansion $\bar{K}$ of K is just the subgraph of $\bar{H}$ induced
by the set of vertices of K. And $\bar{G}$ consists of the subgraph of $\bar{H}$ induced by
the vertices of G together with x' and all these redirected edges. Then the
$HR^{sep}$-term t needs only to fuse in $\bar{G} \oplus \bar{K}$ the above described in(p, j)-sources
with the corresponding s(q, i)-sources. This can be done by a combination of
the operation $\oplus$ and those of the form $\mathrm{fus}_{in(p,j),s(q,i)}$. The only point to check
is that $\bar{G}$ does not contain $\vec{K}_{m+1,m+1}$. We can apply Lemma 6.2 because $\bar{H}$ is
obtained from $\bar{G} \oplus \bar{K}$ by fusions of pairs of vertices which are not adjacent and
have no incoming edges with the same source (because $\bar{G}$ and $\bar{K}$ are disjoint)
and no outgoing edge at all.

Then there exist expansions $\bar{G}'$ and $\bar{K}'$ of G' and K' respectively, equivalent
to $\bar{G}$ and $\bar{K}$. By letting $\bar{H}' = t(\bar{G}', \bar{K}')$, we get the desired expansion of H',
equivalent to $\bar{H}$.

This case is illustrated in Figure 3 where m = 3 and N is the constructed
expansion of $G \otimes_J K$.




Figure 3: $G \otimes_J K = \mathrm{add}_{p,q}(G \oplus K)$




Case 2. p is small in G and q is large in K.

It is fully similar to the first case, creating new out(q, j)-sources instead of
in(p, j)-sources. We omit the details.

Case 3. p is small in G and q is small in K.

Let $\bar{G}$ be the subgraph with sources of $\bar{H}$ consisting of the vertices of G,
and let $\bar{K}$ be defined similarly in terms of K. Then $\bar{H}$ is obtained from $\bar{G} \oplus \bar{K}$
by the addition of edges from each s(p, i)-source of $\bar{G}$ to each s(q, j)-source of
$\bar{K}$, which can be done by an $HR^{sep}$-term (see Case 2 of the discussion of the
renaming operation). Since $\bar{G}$ and $\bar{K}$ are subgraphs of $\bar{H}$, they cannot contain
$\vec{K}_{m+1,m+1}$ and hence, they are in fact expansions of G and K as desired. The
proof continues as above.

Case 4. p is void in G or q is void in K.

Then $\mathrm{add}_{p,q}$ acts as the identity on $G \oplus K$, so $\otimes_J$ acts as $\oplus$ on (G, K) and
we are back to a previously studied case. Recall that if p (resp. q) is void in G
(resp. K), then it is void in every ~-equivalent graph with source.

This concludes the study of the case where J is a singleton in P x Q. The 
case where J is a singleton in Q x P is of course similar. 

The proof is actually the same in the general case where J is not a singleton.
We need only do the same constructions for all elements (p, q) in J. The only
possible difficulty could arise from the use of Lemma 6.2 to verify that the
graphs $\bar{G}$ and $\bar{K}$ obtained from $\bar{H}$ by the creation of vertices (like x' in Case 1
above) and the redirection of edges do not contain $\vec{K}_{m+1,m+1}$, and hence are
expansions. Thus let us consider the transformation of $\bar{G} \oplus \bar{K}$ into $\bar{H}$. It consists
in a sequence of fusions of pairs of vertices. Whenever we fuse an in(p, i)-source
of $\bar{G}$, say x, with an s(q, j)-source of $\bar{K}$, say y, we must verify that the fusions
performed previously keep the hypothesis of Lemma 6.2 valid. It is clear that x
and y are not adjacent, since x is adjacent with vertices of G only. Because of
previous fusions, there may exist an edge from some z in G to y. However, this
edge comes from a previously applied operation $\mathrm{add}_{p',q}$ with $p' \ne p$. It follows
that there is no edge from z to x. An analogous argument also applies to fusions
between an out(p, i)-source of $\bar{G}$ and an s(q, j)-source of $\bar{K}$, and also when we
exchange the roles of G and K. Hence, finally, we can apply Lemma 6.2 to
deduce that $\bar{G}$ and $\bar{K}$ do not contain $\vec{K}_{m+1,m+1}$ because $\bar{H}$ does not. Hence,
they are expansions of G and K, as we needed to check.

This concludes the proof of Theorem 6.1.

6.2 Other finiteness conditions

We now consider some consequences of Theorem 6.1. Let $K_{n,n}$ be the undirected
complete bipartite graph with n + n vertices, that is, $K_{n,n}$ is the undirected
graph underlying $\vec{K}_{n,n}$. We say that a (directed) graph is without $K_{n,n}$ if its
undirected underlying graph has no subgraph isomorphic to $K_{n,n}$.

We say that a graph G is uniformly k-sparse if $\mathrm{card}(E(H)) \le k \cdot \mathrm{card}(V(H))$
for every finite subgraph H of G, where V(H) and E(H) are the sets of vertices
and edges of H. A set of graphs is uniformly k-sparse if each of its elements is.
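
As an illustration of this definition, the following minimal Python sketch tests uniform k-sparsity of a small finite directed graph by brute force. It assumes the bound is read as card(E(H)) ≤ k · card(V(H)); the function names `is_uniformly_k_sparse` and `directed_complete_bipartite` are hypothetical helpers introduced here, not notation from the paper. It is only meant for very small graphs, since it enumerates all vertex subsets.

```python
from itertools import combinations


def is_uniformly_k_sparse(vertices, edges, k):
    """Return True if every nonempty subgraph H has |E(H)| <= k * |V(H)|.

    Testing induced subgraphs is enough: for a fixed vertex set, the
    induced subgraph carries the largest possible number of edges.
    """
    vertices = list(vertices)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            induced = sum(1 for (x, y) in edges if x in chosen and y in chosen)
            if induced > k * size:
                return False
    return True


def directed_complete_bipartite(n):
    """Directed complete bipartite graph: all edges go from the n 'u' vertices
    to the n 'w' vertices (an assumed encoding, chosen for this example)."""
    left = [("u", i) for i in range(n)]
    right = [("w", j) for j in range(n)]
    edges = [(x, y) for x in left for y in right]
    return left + right, edges
```

For instance, with k = 1 the directed complete bipartite graph on 3 + 3 vertices has 9 edges and 6 vertices, so it fails the test, in line with the remark used later that a complete bipartite graph on (2k+1) + (2k+1) vertices is not k-sparse.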

Proposition 6.5 Let L ⊆ Graph be a set of graphs, satisfying one of the
following properties:

L is without $\vec{K}_{n,n}$ for some n,
or L is without $K_{n,n}$ for some n,
or L consists only of planar graphs,
or L is uniformly k-sparse for some k,
or L consists only of graphs of tree-width at most k for some k.

Then L is HR-recognizable if and only if L is VR-recognizable.

Proof. By Corollary 5.2, it is always the case that a VR-recognizable set of
graphs is HR-recognizable.

If L is without $\vec{K}_{n,n}$ for some n, the converse implication was proved in
Theorem 6.1. Lemma 6.6 below shows that L is without $\vec{K}_{p,p}$ for some p if and
only if it is without $K_{n,n}$ for some n.

It is well-known that planar graphs are without $K_{3,3}$ (planarity is a property
of the underlying undirected graph, and $K_{3,3}$ is the undirected graph underlying
$\vec{K}_{3,3}$). It follows that planar graphs are also without $\vec{K}_{3,3}$, and the result follows
from Theorem 6.1.

It is easily seen that $K_{2k+1,2k+1}$ is not k-sparse. So if L is uniformly k-sparse,
then it is without $\vec{K}_{2k+1,2k+1}$.

Finally, it is known that graphs of tree-width at most k are uniformly
(k + 1)-sparse (see for instance ^E]), which yields the last assertion. □

Lemma 6.6 Let p be an integer. There exists an integer n such that a directed
graph without $\vec{K}_{p,p}$ is without $K_{n,n}$.

Proof. We use the particular case of Ramsey's Theorem for bipartite graphs,
given as Theorem 1 in [23, p. 95]. It states that for each p, there exists an
integer n such that, if the edges of $K_{n,n}$ are partitioned into two sets A and B,
then either A or B contains the edges of a subgraph isomorphic to $K_{p,p}$.

So let us assume that U, W ⊆ V(G), where U and W are disjoint sets of n
elements and there is an edge between u and w (in one or both directions) for
each (u, w) ∈ U × W. Let A be the set of pairs (u, w) ∈ U × W such that there
is an edge from u to w, and let B = (U × W) \ A, so that for each pair in B the
edge goes from w to u. Then there exist sets U' ⊆ U and W' ⊆ W, with
cardinality p, such that U' × W' ⊆ A or U' × W' ⊆ B. In
either case, we get a subgraph of G isomorphic to $\vec{K}_{p,p}$.

Note that a quick and direct proof can be given with $n = p2^{2p}$, but we do
not know the minimal n yielding the result. □
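
The extraction step in this proof can be sketched as a brute-force search. The Python fragment below is only an illustration under assumed conventions: `forward` is the set A of pairs (u, w) whose edge goes from u to w, every other pair of U × W is treated as oriented from w to u, and the function name `find_oriented_biclique` is hypothetical.

```python
from itertools import combinations


def find_oriented_biclique(U, W, forward, p):
    """Search for U' in U and W' in W of size p whose pairs are all in A
    (edges from U' to W') or all in B (edges from W' to U').

    Returns (U', W', orientation) or None if no such pair of subsets exists.
    """
    for U1 in combinations(U, p):
        for W1 in combinations(W, p):
            pairs = [(u, w) for u in U1 for w in W1]
            if all(pair in forward for pair in pairs):
                return U1, W1, "U->W"
            if all(pair not in forward for pair in pairs):
                return U1, W1, "W->U"
    return None
```

Ramsey's theorem for bipartite graphs guarantees that the search succeeds once U and W are large enough with respect to p; for small sets it may return None, as the search is exhaustive rather than constructive.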

Remark 6.7 The statement relative to bounded tree-width sets of graphs in
Proposition 6.5 is also a consequence (in the case of finite graphs) of Lapoire's
result which states that, in a graph of tree-width at most k, one can
construct a width-k tree-decomposition by monadic second-order (MSO) formulas.
This can be used to show that every HR-recognizable set of graphs of bounded
tree-width is definable in Counting Monadic Second-order (CMSO) logic, using
edge set quantifications. Courcelle showed [H\ that, for finite graphs of bounded 
tree-width, edge set quantifications can be replaced by vertex set quantifications.
The considered set is therefore definable in CMSO logic with vertex set
quantifications only, and hence is VR-recognizable by another of Courcelle's results

M □

Remark 6.8 It is proved in [5] that every set of square grids is HR-recognizable.
It follows from Theorem 6.1 that every such set is also VR-recognizable. Hence,
there are uncountably many VR-recognizable sets of graphs, so we cannot hope
for an automata-theoretic or a logical characterization of VR-recognizability,
in contrast with the situation prevailing for words, trees and some special classes
of graphs, see 03 EH El EllH EOj. □ 

6.3 HR-recognizable sets which are not VR-recognizable 

The aim of this short section is to establish the existence of HR-recognizable
sets which are not VR-recognizable. We first establish a lemma.

Lemma 6.9 Every set of cliques (of the form $K_n$, n ≥ 1) is HR-recognizable.

Proof. Let L be a set of undirected cliques (recall that an undirected graph is
a graph where the edge relation is symmetric). We provide a locally finite
CS-congruence on $QS^{sep}$ which saturates L (see Section 5.2.3). By Proposition 5.7,
this establishes that L is HR-recognizable.

For each finite set C of source labels, let $G^I(C)$ be the set of graphs in
$QS^{sep}(C)$ having at least one internal vertex (i.e., a vertex which is not a source),
and let $G^S(C)$ be the set of graphs in $QS^{sep}(C)$ in which every vertex is a source.
In particular, $G^S(C)$ is finite.

Let ≡ be the following equivalence relation on $QS^{sep}$. We use the operation
$\square_C$, as in Section 5.2.3. If G, G' ∈ $QS^{sep}(C)$, we let G ≡ G' if and only if
either G = G', or G, G' ∈ $G^I(C)$ and for every H ∈ $G^S(C)$, $G \,\square_C\, H \in L$ iff
$G' \,\square_C\, H \in L$.

Note that for each C, there are only finitely many ≡-classes in $QS^{sep}(C)$,
namely at most $p + 2^p$, where p is the cardinality of $G^S(C)$.

Moreover, ≡ saturates L. Indeed, suppose that G, G' ∈ $QS^{sep}(C)$, G ≡ G'
and G ∈ L. Let H be the graph in $QS^{sep}(C)$ consisting of distinct c-sources
(c ∈ C) and no edges. Then we have $G = G \,\square_C\, H$ and $G' = G' \,\square_C\, H$. It follows
from the definition of ≡ that G' ∈ L.

Finally, we check that ≡ is a CS-congruence. Let G, G', H, H' ∈ $QS^{sep}(C)$,
with G ≡ G' and H ≡ H': we want to show that $G \,\square_C\, H \equiv G' \,\square_C\, H'$. We
observe that if both G and H have internal vertices, then $G \,\square_C\, H$ is not a clique
(by definition of operation $\square_C$), and hence cannot be in L. The rest of the proof
is a straightforward verification. □

We can now prove the following. 

Proposition 6.10 There is an HR-recognizable set of graphs which is not VR- 
recognizable. 

Proof. Let A be a set of integers which is not recognizable in (N, succ, 0), for
instance the set of prime numbers, and let L be the set of cliques $K_n$ for n ∈ A.
Then L is HR-recognizable by Lemma 6.9.

We now consider a set of VR-terms describing L and using exactly 2 port
labels, p and q. Recall that p denotes the VR-constant of type {p}, that is, the
graph with a single vertex that is a p-port and no edges. The constant q is
defined similarly. Now let $k_1 = p$, and $k_{n+1} = \mathrm{ren}_{q \to p}\,\mathrm{add}_{p,q}\,\mathrm{add}_{q,p}(k_n \oplus q)$. It is
not difficult to verify that $k_n$ denotes the clique $K_n$ where all the vertices are
p-ports, $K_n$ itself is denoted by the term $\mathrm{mdf}_{\emptyset} k_n$, and the set K of all VR-terms of
the form $k_n$ is recognizable (as a set of terms, or trees). If L is VR-recognizable,
then the set of VR-terms in K that denote graphs in L is recognizable. This
set consists of all the terms of the form $\mathrm{mdf}_{\emptyset} k_n$ with n ∈ A, and it can be
shown by standard methods that it is not recognizable. It follows that L is not
VR-recognizable. □
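
The claim that the term k_n denotes the clique K_n with all vertices p-ports can be checked mechanically on small cases. The sketch below is a toy Python evaluation of the three operations involved, under an assumed encoding (a graph with ports is a pair of a dictionary vertex → port label and a set of directed edges); the function names are hypothetical and the encoding is not taken from the paper.

```python
def constant(label, name):
    """VR-constant: a single vertex `name` that is a `label`-port, no edges."""
    return ({name: label}, set())


def disjoint_union(g, h):
    """Disjoint union; vertex names are assumed to be distinct already."""
    gv, ge = g
    hv, he = h
    assert not set(gv) & set(hv)
    return ({**gv, **hv}, ge | he)


def add(g, p, q):
    """add_{p,q}: add an edge from every p-port to every q-port."""
    v, e = g
    new = {(x, y) for x in v for y in v if v[x] == p and v[y] == q and x != y}
    return (v, e | new)


def ren(g, q, p):
    """ren_{q -> p}: relabel every q-port as a p-port."""
    v, e = g
    return ({x: (p if lab == q else lab) for x, lab in v.items()}, e)


def k(n):
    """Evaluate k_1 = p and k_{i} = ren_{q->p} add_{p,q} add_{q,p}(k_{i-1} + q)."""
    g = constant("p", 1)
    for i in range(2, n + 1):
        g = disjoint_union(g, constant("q", i))
        g = ren(add(add(g, "q", "p"), "p", "q"), "q", "p")
    return g
```

Evaluating `k(4)` in this encoding yields four p-ports with an edge in each direction between every pair of distinct vertices, which is the undirected clique on four vertices viewed as a symmetric edge relation.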

6.4 Sparse graphs and monadic second-order logic 

Since graphs are relational structures, logical formulas can be used to specify 
sets of graphs. Monadic second-order logic is especially interesting because 

every monadic second-order definable set of finite graphs is VR-recognizable 
(Courcelle \^\T^). 

There is actually a version of monadic second-order logic allowing
quantifications on edges and sets of edges (one replaces the graph under consideration
by its incidence graph; we omit details). We say that a set is $MS_2$-definable
if it is definable by a monadic second-order formula with edge and edge set
quantifications, and we use the phrase $MS_1$-definable to refer to the first
notion. It is immediately verified (from the definition) that

Every $MS_1$-definable set is $MS_2$-definable.

The two following statements are more difficult.

Every $MS_2$-definable set of simple graphs is HR-recognizable (Courcelle (HfJ-

If a set of simple graphs is uniformly k-sparse for some k and $MS_2$-definable,
then it is $MS_1$-definable (Courcelle j!6|).

This is somewhat analogous to the situation of Theorem 6.1 (see
Proposition 6.5). However the combinatorial conditions are different: if a set of graphs
is uniformly k-sparse for some k, it is without $K_{t,t}$ for some t, but the converse
does not hold. It is proved in the book by Bollobas [2] that, for each t ≥ 2,
there is a number a such that for each n, there is a graph with n vertices and
$a n^b$ edges that does not contain $K_{t,t}$, where b = 2t/(t + 1). For these graphs,
the number of edges is not linearly bounded in terms of the number of vertices,
so they are not uniformly k-sparse for any k.

It is not clear how to extend Courcelle's proof in ^H], to use the condition
without $K_{t,t}$ instead of uniformly k-sparse.

7 Simple graphs vs multi-graphs 

The formal setting of relational structures is very convenient to deal with simple
graphs, as we have seen already. It can also be used to formalize multi-graphs
(i.e., graphs with multiple edges), if we consider two-sorted relational structures.

Formally, a multi-graph with sources in C is a structure of the form G =
$(V, E, inc, (c_G)_{c \in C})$ where V is the set of vertices, E is the set of edges, each $c_G$
is an element of V, and inc is a ternary relation of type E × V × V. We interpret
the relation inc(e, x, y) to mean that e is an edge from vertex x to vertex y. We
denote by $QS_m(C)$ the set of multi-graphs with sources in C. As in the study
of StS or QS, we assume that the finite sets of source labels C are taken in a
fixed countable set. We let $QS_m$ be the union of the $QS_m(C)$ for all finite sets
C of source labels.

Graphs and hypergraphs with multiple edges and hyperedges are often used, 
see the volume edited by Rozenberg 0Tj. In this context, it is in fact frequent to 
consider operations on multi-graphs that are very similar to the HR-operations 
on QS. More precisely, the operations of disjoint union, source renaming, 
source forgetting and source fusion can be defined naturally on multigraphs 
with sources: thus QS m can be seen naturally as an HR-algebra. 

It is clear that each simple graph in QS(C) can be considered as an element
in $QS_m(C)$. It is important to note however that the HR-operations on $QS_m$,
when applied to such simple graphs, do not necessarily yield the same result as
in QS. For instance, let a, b be distinct elements of C, and let G ∈ QS(C) be
a simple graph. The action of fusing the a-source and the b-source of G may
now result in multiple edges: if there were arrows in both directions between
$a_G$ and $b_G$, or if there were arrows to (resp. from) a vertex of G from (resp. to)
both $a_G$ and $b_G$. In contrast, the same operation in QS(C) yields $\mathrm{fus}_{a,b}(G)$, an
element of QS(C) by definition. To avoid confusion, we will denote by $\mathrm{mfus}_{a,b}$
this operation when used in $QS_m$.

Fortunately, we do not have this sort of problem with the other operations: 
applying the operations of disjoint union, source renaming or source forgetting to 
simple graphs considered as elements of QS m yields the same result as applying 
the same operations within the algebra QS. 
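
The difference between the two fusion operations can be made concrete with a small Python sketch. The encoding is an assumption made for illustration only: a multi-graph is a list of directed edges (repetitions allowed) together with a dictionary from source labels to vertices, and the names `mfus`, `collapse_parallel_edges` and `has_multiple_edges` are hypothetical helpers.

```python
from collections import Counter


def mfus(edges, sources, a, b):
    """Fuse the a-source and the b-source, keeping edge multiplicities."""
    keep, drop = sources[a], sources[b]

    def rename(v):
        return keep if v == drop else v

    new_edges = [(rename(x), rename(y)) for (x, y) in edges]
    new_sources = {c: rename(v) for c, v in sources.items()}
    return new_edges, new_sources


def collapse_parallel_edges(edges):
    """Merge edges with identical origin and end into a single edge."""
    return sorted(set(edges))


def has_multiple_edges(edges):
    """True if some (origin, end) pair occurs more than once."""
    return any(count > 1 for count in Counter(edges).values())
```

For example, fusing the two sources of the simple graph with edges (1, 3) and (2, 3), where a names vertex 1 and b names vertex 2, yields two parallel edges (1, 3) in the multi-graph setting; collapsing parallel edges afterwards gives the single edge that fusion produces in the algebra of simple graphs.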

We let $HR_m$ be the signature on $QS_m$ consisting of the operations of the form
$\oplus$, $\mathrm{srcfg}_a$, $\mathrm{srcren}_{a \to b}$ and $\mathrm{mfus}_{a,b}$. Thus, $QS_m$ is an $HR_m$-algebra. We observe
that, as a signature (that is, as a set of symbols denoting operations), $HR_m$
is in natural bijection with HR. So we don't really need to introduce the new
notation $HR_m$, and we could very well say that $QS_m$ is an HR-algebra. We
simply hope, by introducing this notation, to clarify our comparative study of
recognizable subsets in the algebras QS and $QS_m$. This distinction will be useful
in the proofs of Theorems 7.3 and 7.4.

To summarize and amplify the above remarks, let us introduce the following
notation. We denote by i: QS → $QS_m$ the natural injection. For each
multi-graph G, we denote by u(G) the simple graph obtained from G by fusing multiple
edges (with identical origin and end): that is, u is a mapping from $QS_m$ onto
QS. Elementary properties of i and u are listed in the next proposition.

Proposition 7.1 The mapping u: $QS_m$ → QS is a homomorphism of HR-algebras.
The mapping i: QS → $QS_m$ is not a homomorphism, but it commutes with the
operations of the form $\oplus$, $\mathrm{srcfg}_a$ and $\mathrm{srcren}_{a \to b}$.

The mapping i does not commute with the operations of the form $\mathrm{fus}_{a,b}$, but if G ∈ QS,
then $i(\mathrm{fus}_{a,b}(G)) = i(u(\mathrm{mfus}_{a,b}(i(G))))$.

Finally, if G ∈ QS, then $i(G) = u^{-1}(G) \cap i(QS)$ and u(i(G)) = G.

We now prove the following theorems, which describe the interaction between
$HR_m$-recognizability of sets of multi-graphs and HR-recognizability of sets of
simple graphs.

Theorem 7.2 The set of simple graphs is $HR_m$-recognizable. More precisely,
for each finite set of source labels C, i(QS(C)) is $HR_m$-recognizable.

Theorem 7.3 Let C be a finite set of source labels and let L ⊆ QS(C). Then
L is HR-recognizable if and only if i(L) is $HR_m$-recognizable.

Theorem 7.4 Let C be a finite set of source labels and let L ⊆ $QS_m(C)$. If L
is $HR_m$-recognizable, then u(L) is HR-recognizable.



7.1 Proof of Theorem 7.2

We first introduce the notion of the type of a multi-graph: as for the elements
of StS, if G ∈ $QS_m(C)$, we let $\zeta(G)$ be the restriction of G to its C-sources and
to the edges between them. We also denote by $\zeta$ the relation on $QS_m$ induced
by this type mapping: two multi-graphs G, H ∈ $QS_m(C)$ are $\zeta$-equivalent if
$\zeta(G) = \zeta(H)$.

Lemma 7.5 The type relation $\zeta$ is an $HR_m$-congruence on $QS_m$. Moreover, for
each finite set of source labels C, the elements of i(QS(C)) can be found in only
a finite number of $\zeta$-classes.

Proof. The result follows from the following, easily verifiable identities, where
the multi-graphs G, H are assumed to have the appropriate sets of sources.

$\zeta(G \oplus H) = \zeta(G) \oplus \zeta(H)$

$\zeta(\mathrm{srcren}_{a \to b}(G)) = \mathrm{srcren}_{a \to b}(\zeta(G))$

$\zeta(\mathrm{mfus}_{a,b}(G)) = \mathrm{mfus}_{a,b}(\zeta(G))$

$\zeta(\mathrm{srcfg}_a(G)) = \zeta(\mathrm{srcfg}_a(\zeta(G)))$.

The finiteness of the number of $\zeta$-classes containing elements of i(QS(C)) follows
from the fact that there are only finitely many source-only simple graphs with
sources in C. □



We also introduce the following finite invariant for a simple graph G ∈
QS(C). We define $\eta(G)$ to be the set of all pairs {a, b} of elements of C such
that a ≠ b, $a_G \ne b_G$ and there exists a vertex x of G with either edges from x
to both $a_G$ and $b_G$, or edges to x from both $a_G$ and $b_G$. The set $\eta(G)$ can be
viewed as a symmetric anti-reflexive relation on C.

Lemma 7.6 Let G be a simple graph in QS(C) and let a ≠ b be elements of C.
Then $\mathrm{mfus}_{a,b}(G)$ has multiple edges if and only if {a, b} ∈ $\eta(G)$ or $\mathrm{mfus}_{a,b}(\zeta(G))$
has multiple edges.

Proof. We first observe that $\mathrm{mfus}_{a,b}(G)$ has multiple edges if and only if $a_G \ne
b_G$ and at least one of the following situations occurs: there are edges in both
directions between $a_G$ and $b_G$, or there is a vertex x of G with edges from (resp.
to) both $a_G$ and $b_G$ (this includes the case where there is a loop at $a_G$ or $b_G$
and an edge in either direction between $a_G$ and $b_G$). That is, $\mathrm{mfus}_{a,b}(G)$ has
multiple edges if and only if {a, b} ∈ $\eta(G)$ or there are edges in both directions
between $a_G$ and $b_G$.

We also observe that $\mathrm{mfus}_{a,b}(\zeta(G))$ is a subgraph of $\mathrm{mfus}_{a,b}(G)$, so the former
is simple if the latter is. Finally, the existence of edges in both directions between
$a_G$ and $b_G$ is sufficient to ensure that $\mathrm{mfus}_{a,b}(\zeta(G))$ has multiple edges.

These observations put together suffice to prove the lemma. □
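
The invariant and the statement of Lemma 7.6 can be explored on small examples with the following Python sketch. The encoding is an assumption for illustration (a simple graph is a collection of directed edges, possibly with loops, plus a dictionary from source labels to vertices), and the function names `eta`, `zeta`, `mfus_has_multiple_edges` and `lemma_7_6_holds` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def eta(edges, sources):
    """Pairs {a, b} of labels naming distinct vertices that share a common
    in-neighbour or a common out-neighbour x (x may itself be a source)."""
    E = set(edges)
    vertices = {v for e in E for v in e} | set(sources.values())
    result = set()
    for a, b in combinations(sorted(sources), 2):
        va, vb = sources[a], sources[b]
        if va == vb:
            continue
        if any(((x, va) in E and (x, vb) in E) or
               ((va, x) in E and (vb, x) in E) for x in vertices):
            result.add(frozenset((a, b)))
    return result


def zeta(edges, sources):
    """Type of the graph: keep only the edges between source vertices."""
    src = set(sources.values())
    return [(x, y) for (x, y) in edges if x in src and y in src]


def mfus_has_multiple_edges(edges, sources, a, b):
    """Fuse the a- and b-sources as in a multi-graph; report parallel edges."""
    keep, drop = sources[a], sources[b]
    fused = [(keep if x == drop else x, keep if y == drop else y)
             for (x, y) in edges]
    return any(n > 1 for n in Counter(fused).values())


def lemma_7_6_holds(edges, sources, a, b):
    """Compare both sides of the equivalence stated in Lemma 7.6."""
    lhs = mfus_has_multiple_edges(edges, sources, a, b)
    rhs = (frozenset((a, b)) in eta(edges, sources)
           or mfus_has_multiple_edges(zeta(edges, sources), sources, a, b))
    return lhs == rhs
```

Such a check is no substitute for the proof, but running `lemma_7_6_holds` over all simple graphs on three vertices with two sources is a cheap sanity test of the definitions.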

We are now ready to prove Theorem 7.2. Let ~ be the following relation,
defined on each $QS_m(C)$. We let G ~ G' if both G and G' have multiple edges,
or both G and G' are simple graphs, $\zeta(G) = \zeta(G')$ and $\eta(G) = \eta(G')$.

It is immediate that ~ is an equivalence relation, saturating i(QS(C)). It
follows from Lemma 7.5 and from the fact that $\eta(G)$ is a subset of the finite
set C × C, that ~ is locally finite. So we only need to show that ~ is an
$HR_m$-congruence.

We need to describe the interaction between the mapping $\eta$ and the
$HR_m$-operations. As observed in Proposition 7.1, all $HR_m$-operations preserve simple
graphs except for the operations of the form $\mathrm{mfus}_{a,b}$. Assuming that G, H are
simple graphs with the appropriate sets of sources, we easily verify the following:

$\eta(G \oplus H) = \eta(G) \cup \eta(H)$

$\eta(\mathrm{srcfg}_a(G)) = \eta(G) \setminus \{\{a, b\} \mid b \in C, \{a, b\} \in \eta(G)\}$

$\eta(\mathrm{srcren}_{a \to b}(G)) = \eta(G) \setminus \{\{a, c\} \mid c \in C, \{a, c\} \in \eta(G)\} \cup \{\{b, c\} \mid c \in C, \{a, c\} \in \eta(G)\}$

Moreover, if $a_G \ne b_G$ and $\mathrm{mfus}_{a,b}(G)$ is simple (if it isn't, its $\eta$-image is not
defined), then $\eta(\mathrm{mfus}_{a,b}(G))$ consists of:

(1) all pairs in $\eta(G)$,

(2) all pairs {c, d} such that there are edges in $\zeta(G)$ from a to c and from b
to d, or from c to a and from d to b,

(3) all pairs {a, c} (resp. {b, c}) such that {b, c} ∈ $\eta(G)$ (resp. {a, c} ∈
$\eta(G)$),

(4) all pairs {a, c} and {b, c} such that there are edges in $\zeta(G)$ between a
and b (in either direction) and between a or b and c (in any direction).

Let us justify this statement: it is easy to see that all these pairs belong
to $\eta(\mathrm{mfus}_{a,b}(G))$. In particular, $\eta(G) \subseteq \eta(\mathrm{mfus}_{a,b}(G))$ since, as $\mathrm{mfus}_{a,b}(G)$ is
assumed to be simple, there is no {c, d} ∈ $\eta(G)$ such that $a_G = c_G$ and $b_G = d_G$.

Conversely, let us consider distinct edges in G' = $\mathrm{mfus}_{a,b}(G)$, from y to x
and from z to x, as in Figure 4 (note that x and y may be equal), such that
$y = e_{G'}$ and $z = f_{G'}$ for e, f ∈ C. If neither x, nor y nor z is the a- and b-source
in G', then we are in case (1), i.e., {e, f} ∈ $\eta(G)$. If x is the a- and b-source in
G' but neither y nor z is, then {e, f} satisfies case (1) or (2). If y is the a- and
b-source in G' but neither x nor z is, then {e, f} satisfies case (3). The same
holds by symmetry if z is the only one of these three vertices to be the a- and
b-source in G'. Finally if x = y (resp. x = z) and is the a- and b-source in G',
then there is an edge between the a- and the b-source in G and {e, f} satisfies
case (4). The case of edges from x to y and to z is symmetrical.

Figure 4: Distinct edges in $\mathrm{mfus}_{a,b}(G)$

In particular, $\eta(G \oplus H)$, $\eta(\mathrm{srcfg}_a(G))$, $\eta(\mathrm{srcren}_{a \to b}(G))$ and $\eta(\mathrm{mfus}_{a,b}(G))$
are entirely determined by $\eta(G)$, $\zeta(G)$ and $\eta(H)$.

Let us now consider G, G', H, H' in $QS_m$ (with the appropriate sets of
sources) such that G ~ G' and H ~ H'. If G is not simple, then neither
are G', $G \oplus H$, $\mathrm{srcfg}_a(G)$, $\mathrm{srcren}_{a \to b}(G)$ and $\mathrm{mfus}_{a,b}(G)$. In particular, we have
$G \oplus H$ ~ $G' \oplus H'$, $\mathrm{srcfg}_a(G)$ ~ $\mathrm{srcfg}_a(G')$, $\mathrm{srcren}_{a \to b}(G)$ ~ $\mathrm{srcren}_{a \to b}(G')$ and
$\mathrm{mfus}_{a,b}(G)$ ~ $\mathrm{mfus}_{a,b}(G')$.

Assume now that G and H are simple. Then so are $G \oplus H$, $\mathrm{srcfg}_a(G)$ and
$\mathrm{srcren}_{a \to b}(G)$, and we have seen that their $\eta$-images are determined by $\eta(G)$ and
$\eta(H)$. Since $\zeta$ is an $HR_m$-congruence (Lemma 7.5), it follows that ~ is preserved
by the operations $\oplus$, $\mathrm{srcfg}_a$ and $\mathrm{srcren}_{a \to b}$.

By Lemma 7.6, whether $\mathrm{mfus}_{a,b}(G)$ is simple is determined by $\zeta(G)$ and
$\eta(G)$, and hence $\mathrm{mfus}_{a,b}(G)$ and $\mathrm{mfus}_{a,b}(G')$ are both non-simple (and then
~-equivalent) or both simple. In the latter case, their $\eta$-images are equal since they
are both determined by $\eta(G) = \eta(G')$ and $\zeta(G) = \zeta(G')$. Thus ~ is preserved
by the operation $\mathrm{mfus}_{a,b}$. This concludes the proof of Theorem 7.2.

7.2 Proof of Theorem 1731 

Recall that we want to show that for each L 6 QS(C). L is H R- recognizable if 
and only if i(L) is HR m -recognizable. 

One direction is quickly established: we know from Proposition 7.1 that
$\iota(L) = u^{-1}(L) \cap \iota(\mathcal{GS}(C))$. If $L$ is HR-recognizable, then $u^{-1}(L)$ is
HR$^m$-recognizable since $u$ is a homomorphism. In view of Theorem 7.2, it follows
that $\iota(L)$ is HR$^m$-recognizable as well.
Conversely, let us assume that $\iota(L)$ is HR$^m$-recognizable and let $\equiv$ be a
locally finite HR$^m$-congruence on $\mathcal{GS}^m$ saturating $\iota(L)$. We want to define a
locally finite HR-congruence $\sim$ on $\mathcal{GS}$ saturating $L$.

For each symmetric anti-reflexive relation $A$ on a finite set of source labels
$D$ and for each graph $G \in \mathcal{GS}(D)$, let $\mathrm{del}_A(G) \in \mathcal{GS}(D)$ be the graph obtained
from $G$ by deleting the edges between the $a$-source and the $b$-source for each
pair $\{a, b\}$ in $A$. Let also $\mathrm{fus}_A$ be the composition of the operations $\mathrm{fus}_{a,b}$ for
all $\{a, b\} \in A$, in any order.

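For $G, G' \in \mathcal{GS}(D)$, we let $G \sim G'$ if $\iota(G) \equiv \iota(G')$, $\zeta(G) = \zeta(G')$ and, for
each symmetric anti-reflexive relation $A$ on $D$,

$$\iota\,\mathrm{fus}_A\,\mathrm{del}_A(G) \equiv \iota\,\mathrm{fus}_A\,\mathrm{del}_A(G').$$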

The relation $\sim$ is clearly an equivalence relation, and it is locally finite since
$\equiv$ and $\zeta$ are. Moreover, it saturates $L$ since $G \in L$ if and only if $\iota(G) \in \iota(L)$,
and $\equiv$ saturates $\iota(L)$. The rest of the proof consists in showing that $\sim$ is an
HR-congruence.

The source renaming operation. Let $G \sim G'$ in $\mathcal{GS}(D)$. Then $\iota(G) \equiv \iota(G')$.
Since $\equiv$ is a congruence and in view of Proposition 7.1, $\iota(\mathrm{srcren}_{a\to b}(G)) =
\mathrm{srcren}_{a\to b}(\iota(G)) \equiv \mathrm{srcren}_{a\to b}(\iota(G')) = \iota(\mathrm{srcren}_{a\to b}(G'))$. It also follows from
Lemma 3.9 that $\zeta(\mathrm{srcren}_{a\to b}(G)) = \zeta(\mathrm{srcren}_{a\to b}(G'))$.

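Let us now consider a symmetric anti-reflexive relation $A$ on the set of source
labels of $\mathrm{srcren}_{a\to b}(G)$. It is easily verified that

$$\mathrm{del}_A\,\mathrm{srcren}_{a\to b} = \mathrm{srcren}_{a\to b}\,\mathrm{del}_B,$$

where $B = \{\{c,d\} \in A \mid \{c,d\} \cap \{a,b\} = \emptyset\} \cup \{\{a,d\} \mid \{b,d\} \in A\}$. We also
note that if $c, d \in C \setminus \{a, b\}$, then $\mathrm{fus}_{c,d}$ and $\mathrm{srcren}_{a\to b}$ commute. Moreover
$\mathrm{fus}_{b,d}\,\mathrm{srcren}_{a\to b} = \mathrm{srcren}_{a\to b}\,\mathrm{fus}_{a,d}$ and $\mathrm{fus}_{c,b}\,\mathrm{srcren}_{a\to b} = \mathrm{srcren}_{a\to b}\,\mathrm{fus}_{c,a}$. Thus
$\mathrm{fus}_A\,\mathrm{srcren}_{a\to b} = \mathrm{srcren}_{a\to b}\,\mathrm{fus}_B$.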

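Now, using the fact that $\iota$ commutes with $\mathrm{srcren}_{a\to b}$, we have

$$\begin{aligned}
\iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{srcren}_{a\to b}(G) &= \iota\,\mathrm{fus}_A\,\mathrm{srcren}_{a\to b}\,\mathrm{del}_B(G) \\
&= \iota\,\mathrm{srcren}_{a\to b}\,\mathrm{fus}_B\,\mathrm{del}_B(G) \\
&= \mathrm{srcren}_{a\to b}\,\iota\,\mathrm{fus}_B\,\mathrm{del}_B(G).
\end{aligned}$$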

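Since $\equiv$ is an HR$^m$-congruence, it follows that

$$\iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{srcren}_{a\to b}(G) \equiv \iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{srcren}_{a\to b}(G')$$

and, finally, that $\mathrm{srcren}_{a\to b}(G) \sim \mathrm{srcren}_{a\to b}(G')$.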

The source forgetting operation. The proof is the same as for the source renaming
operation, with this simplifying circumstance that $\mathrm{del}_A\,\mathrm{srcfg}_a = \mathrm{srcfg}_a\,\mathrm{del}_A$
and $\mathrm{fus}_A\,\mathrm{srcfg}_a = \mathrm{srcfg}_a\,\mathrm{fus}_A$ (since $a$ is not a source label of $\mathrm{srcfg}_a(G)$, and hence
does not occur in $A$).
The source fusion operation. Let $G \sim G'$ in $\mathcal{GS}(D)$. Here it is not immediate
that $\iota(\mathrm{fus}_{a,b}(G)) \equiv \iota(\mathrm{fus}_{a,b}(G'))$. However, if we let $A = \{\{a, b\}\}$, we know
that

$$\iota\,\mathrm{fus}_A\,\mathrm{del}_A(G) \equiv \iota\,\mathrm{fus}_A\,\mathrm{del}_A(G').$$

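We note that $\mathrm{fus}_A\,\mathrm{del}_A(G)$ is equal to $\mathrm{fus}_{a,b}(G)$ if $G$ has no edge between its
$a$- and $b$-source, or if it has a loop at either. Otherwise, $\mathrm{fus}_{a,b}(G)$ is equal to
$\mathrm{fus}_A\,\mathrm{del}_A(G)$ with a loop added to its $a$-source, that is:

$$\mathrm{fus}_{a,b}(G) = \mathrm{srcfg}_\alpha\,\mathrm{srcfg}_\beta\,\mathrm{fus}_{a,\alpha}\,\mathrm{fus}_{b,\beta}(\mathrm{fus}_A\,\mathrm{del}_A(G) \oplus E) \qquad (*)$$

where $\alpha$ and $\beta$ are source labels not in $D$ and $E$ is the graph in $\mathcal{GS}(\{\alpha, \beta\})$ with
2 vertices and a single edge from its $\alpha$-source to its $\beta$-source.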
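The identity (*) can be checked mechanically on small examples. The following is a minimal sketch, assuming a simple directed graph with sources is modelled as a triple (vertex set, edge set, label-to-vertex map); the function names, the labels "alpha" and "beta", and the convention that a fused vertex keeps the smaller of the two names are illustrative assumptions, not notation from the paper.

```python
# Hypothetical model: a graph with sources is (vertices, edges, src), where
# edges is a set of directed pairs (so the graph is simple) and src maps
# source labels to vertices. Vertices are integers.

def fus(g, a, b):
    """Fuse the a-source and the b-source; parallel edges collapse in the set."""
    vertices, edges, src = g
    keep, drop = sorted((src[a], src[b]))
    m = lambda v: keep if v == drop else v
    return ({m(v) for v in vertices},
            {(m(x), m(y)) for x, y in edges},
            {lab: m(v) for lab, v in src.items()})

def delete_between(g, a, b):
    """del_A for A = {{a, b}}: drop edges between the two sources, keep loops."""
    vertices, edges, src = g
    x, y = src[a], src[b]
    kept = {e for e in edges if x == y or e not in {(x, y), (y, x)}}
    return (vertices, kept, src)

def disjoint_union(g, h):
    (v1, e1, s1), (v2, e2, s2) = g, h
    assert not (v1 & v2) and not (set(s1) & set(s2))
    return (v1 | v2, e1 | e2, {**s1, **s2})

def srcfg(g, a):
    """Forget the source label a (the vertex itself stays)."""
    vertices, edges, src = g
    return (vertices, edges, {lab: v for lab, v in src.items() if lab != a})

def fus_via_star(g, a, b):
    """Right-hand side of (*), using fresh vertices for the one-edge graph E."""
    fresh = max(g[0]) + 1
    E = ({fresh, fresh + 1}, {(fresh, fresh + 1)},
         {"alpha": fresh, "beta": fresh + 1})
    t = disjoint_union(fus(delete_between(g, a, b), a, b), E)
    t = fus(fus(t, b, "beta"), a, "alpha")
    return srcfg(srcfg(t, "beta"), "alpha")

# An edge joins the a-source and the b-source, and there is no loop at either.
G = ({0, 1, 2}, {(0, 1), (1, 2)}, {"a": 0, "b": 1})
# No edge joins the two sources: fus_A del_A already equals fus_{a,b}.
G2 = ({0, 1, 2}, {(0, 2), (2, 1)}, {"a": 0, "b": 1})
```

On `G`, both `fus(G, "a", "b")` and `fus_via_star(G, "a", "b")` produce a two-vertex graph with a loop at the fused source; on `G2` the deletion step is a no-op, which matches the first case of the discussion above.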

Observe also that the existence of loops at, or edges between, the $a$- and
$b$-source of $G$ is a condition that depends only on $\zeta(G)$, so it will be satisfied by
both $G$ and $G'$ or by neither.

In the first case, where $\mathrm{fus}_A\,\mathrm{del}_A(G) = \mathrm{fus}_{a,b}(G)$, we find immediately that
$\iota(\mathrm{fus}_{a,b}(G)) \equiv \iota(\mathrm{fus}_{a,b}(G'))$. In the second case, the same $\equiv$-equivalence is
derived from Proposition 7.1 and Equation (*) above.

By Lemma 3.9, $\zeta$-equivalence is preserved by the operation $\mathrm{fus}_{a,b}$.

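Now let $A$ be a symmetric anti-reflexive relation on $D$: we consider the graph
$\iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{fus}_{a,b}(G)$. Our first observation is that $\mathrm{del}_A\,\mathrm{fus}_{a,b} = \mathrm{fus}_{a,b}\,\mathrm{del}_B$ where

$$B = A \cup \{\{a, c\} \mid \{b, c\} \in A\} \cup \{\{b, c\} \mid \{a, c\} \in A\}.$$

Next, we observe that $\mathrm{fus}_A\,\mathrm{fus}_{a,b} = \mathrm{fus}_{a,b}\,\mathrm{fus}_B$. Thus we have

$$\iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{fus}_{a,b}(G) = \iota\,\mathrm{fus}_A\,\mathrm{fus}_{a,b}\,\mathrm{del}_B(G) = \iota\,\mathrm{fus}_{a,b}\,\mathrm{fus}_B\,\mathrm{del}_B(G),$$

and hence $\iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{fus}_{a,b}(G) \equiv \iota\,\mathrm{fus}_A\,\mathrm{del}_A\,\mathrm{fus}_{a,b}(G')$. It follows that
$\mathrm{fus}_{a,b}(G) \sim \mathrm{fus}_{a,b}(G')$.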

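The disjoint union operation. Let $G \sim G'$ in $\mathcal{GS}(C)$ and $H \sim H'$ in $\mathcal{GS}(D)$
(where $C$ and $D$ are disjoint). Since $\iota$ and $\zeta$ preserve $\oplus$, we have $\iota(G \oplus H) \equiv
\iota(G' \oplus H')$ and $\zeta(G \oplus H) = \zeta(G' \oplus H')$.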
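Now let $A$ be a symmetric anti-reflexive relation on $C \cup D$. Let $Q$ (resp. $R$)
be the restriction of $A$ to $C$ (resp. $D$) and let $P = A \cap ((C \times D) \cup (D \times C))$. It
is easily verified that

$$\begin{aligned}
\mathrm{del}_A(G \oplus H) &= \mathrm{del}_Q(G) \oplus \mathrm{del}_R(H), \\
\mathrm{fus}_A\,\mathrm{del}_A(G \oplus H) &= \mathrm{fus}_P(\mathrm{fus}_Q\,\mathrm{del}_Q(G) \oplus \mathrm{fus}_R\,\mathrm{del}_R(H)).
\end{aligned}$$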

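It now follows from Proposition 7.1 that

$$\begin{aligned}
\iota\,\mathrm{fus}_A\,\mathrm{del}_A(G \oplus H) &= \iota\,\mathrm{fus}_P(\mathrm{fus}_Q\,\mathrm{del}_Q(G) \oplus \mathrm{fus}_R\,\mathrm{del}_R(H)) \\
&= \iota\,u\,\mathrm{mfus}_P\,\iota(\mathrm{fus}_Q\,\mathrm{del}_Q(G) \oplus \mathrm{fus}_R\,\mathrm{del}_R(H)) \\
&= \iota\,u\,\mathrm{mfus}_P(\iota\,\mathrm{fus}_Q\,\mathrm{del}_Q(G) \oplus \iota\,\mathrm{fus}_R\,\mathrm{del}_R(H)).
\end{aligned}$$

Thus $\iota\,\mathrm{fus}_A\,\mathrm{del}_A(G \oplus H) \equiv \iota\,\mathrm{fus}_A\,\mathrm{del}_A(G' \oplus H')$, and hence $G \oplus H \sim G' \oplus H'$.
This concludes the proof of Theorem 7.3.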
7.3 Proof of Theorem 7.4



Let $L \subseteq \mathcal{GS}^m(C)$ be HR$^m$-recognizable, and let $\equiv$ be a locally finite HR$^m$-congruence
saturating $L$. We want to show that $u(L)$ (a subset of $\mathcal{GS}(C)$) is
HR-recognizable.

Let $G, G' \in \mathcal{GS}(D)$. We let $G \sim G'$ if, for each $H \in u^{-1}(G)$, there exists
$H' \in u^{-1}(G')$ such that $H \equiv H'$, and symmetrically, for each $H' \in u^{-1}(G')$,
there exists $H \in u^{-1}(G)$ such that $H \equiv H'$.

The relation $\sim$ is easily seen to be a locally finite equivalence relation on
$\mathcal{GS}$, saturating $u(L)$. It remains to see that $\sim$ is an HR-congruence.

We first establish the following lemma. 

Lemma 7.7 Let $G \in \mathcal{GS}^m$ and let $H, K \in \mathcal{GS}$.

• $u(G) = H \oplus K$ if and only if there exist multi-graphs $H', K'$ such that
$G = H' \oplus K'$, $u(H') = H$ and $u(K') = K$.

• $u(G) = \mathrm{srcfg}_a(H)$ if and only if there exists a multi-graph $H'$ such that
$G = \mathrm{srcfg}_a(H')$ and $u(H') = H$.

• $u(G) = \mathrm{srcren}_{a\to b}(H)$ if and only if there exists a multi-graph $H'$ such that
$G = \mathrm{srcren}_{a\to b}(H')$ and $u(H') = H$.

• $u(G) = \mathrm{fus}_{a,b}(H)$ if and only if there exists a multi-graph $H'$ such that
$G = \mathrm{mfus}_{a,b}(H')$ and $u(H') = H$.

Proof. Recall that $G$ and $u(G)$ have the same set of vertices, and each edge
$e$ of $u(G)$ arises from the identification of $n(e) \geq 1$ edges of $G$ between the same
vertices.

If $u(G) = H \oplus K$, each edge of $u(G)$ is in exactly one of $H$ and $K$. Let $H'$
(resp. $K'$) be the graph obtained from $H$ (resp. $K$) by replacing each edge $e$
by $n(e)$ parallel edges. Then $G = H' \oplus K'$, $u(H') = H$ and $u(K') = K$, as
required.

The proof of the statements relative to the operations $\mathrm{srcfg}_a$ and $\mathrm{srcren}_{a\to b}$
is done in the same fashion.

Let us finally consider the case where $u(G) = \mathrm{fus}_{a,b}(H)$. If $a_H = b_H$, that
is, $H = u(G)$, then $G = \mathrm{mfus}_{a,b}(G)$ and we can let $H' = G$.

If $a_H \neq b_H$, we let $H'$ be obtained from $H$ as follows:
for each vertex $x$, each edge $e$ from $x$ to $y$ ($y \neq a, b$) is replaced by $n(e)$ parallel
edges, and the edges from $x$ to $a$ and $b$ are duplicated to a total of $n(e)$ edges. □

We can now conclude the proof of Theorem 7.4 by proving that $\sim$ is an HR-congruence.
Let $G \sim G'$ and $H \sim H'$. Let $K \in u^{-1}(G \oplus H)$. By Lemma 7.7,
$K = L \oplus M$ for some $L \in u^{-1}(G)$ and $M \in u^{-1}(H)$. Since $G \sim G'$ and $H \sim H'$,
there exist $L' \in u^{-1}(G')$ and $M' \in u^{-1}(H')$ such that $L' \equiv L$ and $M' \equiv M$.

Let $K' = L' \oplus M'$. Then $K' = L' \oplus M' \equiv L \oplus M = K$ and $K' \in u^{-1}(G' \oplus H')$.
By symmetry, this shows that $G \oplus H \sim G' \oplus H'$.

The verification that $\sim$ is preserved by the other HR-operations proceeds
along the same lines. This concludes the proof of Theorem 7.4.
8 Graph algebras based on graph substitutions



The class Graph, defined in Section 5.1, has already been discussed in terms of
the signatures $\mathcal{S}$, VR and HR since it is a domain in each of the three algebras
$\mathcal{S}t\mathcal{S}$, $\mathcal{GP}$ and $\mathcal{GS}$. In this section, we consider a different set of operations on
Graph, arising from the theory of the modular decomposition of graphs, which 
makes Graph an algebra (one-sorted for a change!). This algebraic framework 
was considered by the authors, in and |46|. 

We first recall the definition of the composition operation on graphs. Let $H$
be a graph with vertex set $[n] = \{1, \ldots, n\}$ ($n \geq 2$). If $G_1, \ldots, G_n$ are graphs,
then the composite $H(G_1, \ldots, G_n)$ is obtained by taking the disjoint union of
the graphs $G_1, \ldots, G_n$, and by adding, for each edge $(i, j)$ of $H$ where $i \neq j$, an
edge from every vertex of $G_i$ to every vertex of $G_j$.
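The definition of the composite can be sketched directly in code. This is an illustrative sketch only: a graph is assumed to be a pair (vertex set, set of directed edges), and vertices of the i-th argument are tagged with i to make the union disjoint; the name `compose` is ours.

```python
def compose(H_edges, parts):
    """H(G_1, ..., G_n): parts[i-1] = (vertices, edges) of G_i; H has vertex set [n]."""
    vertices, edges = set(), set()
    for i, (vs, es) in enumerate(parts, start=1):
        # Tag each vertex with its block index so that the union is disjoint.
        vertices |= {(i, v) for v in vs}
        edges |= {((i, x), (i, y)) for x, y in es}
    for i, j in H_edges:
        if i != j:  # loops of H play no role in the composition
            edges |= {((i, x), (j, y))
                      for x in parts[i - 1][0] for y in parts[j - 1][0]}
    return vertices, edges

# H is the one-edge graph 1 -> 2; G_1 has two isolated vertices, G_2 has one.
V, E = compose({(1, 2)}, [({"x", "y"}, set()), ({"z"}, set())])
```

Here every vertex of the first argument receives an edge to the single vertex of the second argument, and no other edge is created.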

We say that a graph is indecomposable, or prime, if it cannot be written
non-trivially as a composition (a composition is trivial if each of its arguments
is a singleton). It is easily verified that if $H$ and $H'$ are isomorphic graphs,
then the corresponding composition operations yield isomorphic graphs. So we
fix a set $\mathcal{F}_\infty$ of representatives of the isomorphism classes of indecomposable
graphs. In particular, we may assume that every graph in $\mathcal{F}_\infty$ has a vertex set
of the form $[n]$ for some $n \geq 2$. We also denote by $\mathcal{F}_\infty$ the resulting modular
signature, consisting of the composition operations defined by these graphs. The
$\mathcal{F}_\infty$-algebra of graphs is denoted by $\mathrm{Graph}_{\mathcal{F}_\infty}$.
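For small graphs, primality can be tested by brute force. The sketch below assumes the usual reading from modular decomposition: a graph on n vertices is prime when it has no module M with 2 ≤ |M| < n, where a module is a vertex set that every outside vertex sees uniformly (for both incoming and outgoing edges). The function names are ours, and the exponential search is meant only as an illustration.

```python
from itertools import combinations

def is_module(vertices, edges, M):
    """Every vertex outside M is related in the same way to all vertices of M."""
    for v in vertices - M:
        into_v = {(x, v) in edges for x in M}
        from_v = {(v, x) in edges for x in M}
        if len(into_v) > 1 or len(from_v) > 1:
            return False
    return True

def is_prime(vertices, edges):
    """No module of size between 2 and n - 1 (exponential-time check)."""
    n = len(vertices)
    return not any(is_module(vertices, edges, set(M))
                   for k in range(2, n)
                   for M in combinations(vertices, k))

def directed_path(n):
    return set(range(1, n + 1)), {(i, i + 1) for i in range(1, n)}

# A symmetric path on 3 vertices: its two end vertices form a module.
symmetric_p3 = ({1, 2, 3}, {(1, 2), (2, 1), (2, 3), (3, 2)})
```

Directed paths pass this test, while the symmetric three-vertex path does not, since its end vertices are indistinguishable from the middle vertex.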

It turns out that every finite graph admits a modular decomposition, that
is, it can be expressed from the single-vertex graph using only operations from
$\mathcal{F}_\infty$. This fact has been rediscovered a number of times in the context of graph
theory and of other fields using graph-theoretic representations. We refer to 
for a historical survey, and to [35] for a concise presentation. In other words,
Graph is generated by the signature $\mathcal{F}_\infty$ augmented with the constants $v^{\mathrm{loop}}$ and
$v$, which denote a single-vertex graph, respectively with and without a single
loop edge.

Remark 8.1 The modular decomposition of a graph is unique up to certain
simple (equational) rules, see for instance [46]. Moreover, the modular decomposition
of a graph can be computed in linear time |^ 1^ . □

Our first results connect VR-recognizability and $\mathcal{F}_\infty$-recognizability.

Proposition 8.2 Every VR-recognizable set of graphs is $\mathcal{F}_\infty$-recognizable.

Proof. In view of Proposition 2.1 and Theorem 4.5, it suffices to show that
every operation in $\mathcal{F}_\infty$ is VR$^+$-derived.

For each integer $i$, let $\mathrm{mark}_i$ be the unary operation on $\mathcal{GP}$, of type $\emptyset \to \{i\}$,
defined as follows: given a graph without ports, it simply marks every vertex
with port label $i$ (leaving the set of vertices and the edge relation unchanged).
Note that $\mathrm{mark}_i$ is a qfd unary operation, and hence a VR$^+$-operation.
Let $H$ be an $n$-ary operation, that is, a graph in $\mathcal{F}_\infty$ with vertex set $[n]$, and
let $\mathrm{edge}_H$ be its edge relation. If $G_1, \ldots, G_n$ are finite graphs, the construction
of $H(G_1, \ldots, G_n)$ can be described as follows:

- construct the disjoint union $\mathrm{mark}_1(G_1) \oplus \cdots \oplus \mathrm{mark}_n(G_n)$, an element of
$\mathcal{GP}([n])$;

- apply (in any order) to this disjoint union the operations $\mathrm{add}_{i,j}$ for all
$i, j \in [n]$ such that $(i, j)$ is an edge of $H$ and $i \neq j$;

- forget all ports, that is, apply the operation $\mathrm{mdf}_\emptyset$.

This completes the verification that the operation defined by $H$ can be expressed
as a VR$^+$-term, and hence the proof. □
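The three steps above can be sketched as follows. This is a hedged illustration, not the paper's formalism: a graph with ports is assumed to be a triple (vertices, directed edges, vertex-to-port-label map), and `mark` additionally tags vertices with the port label so that unions are disjoint; all function names are ours.

```python
def mark(g, i):
    """mark_i, combined here with tagging vertices by i to keep unions disjoint."""
    vertices, edges = g
    return ({(i, v) for v in vertices},
            {((i, x), (i, y)) for x, y in edges},
            {(i, v): i for v in vertices})

def union(g, h):
    """Disjoint union of two graphs with ports (vertex sets assumed disjoint)."""
    return (g[0] | h[0], g[1] | h[1], {**g[2], **h[2]})

def add(g, i, j):
    """add_{i,j}: an edge from every i-port to every j-port."""
    vertices, edges, port = g
    new = {(x, y) for x in vertices for y in vertices
           if port[x] == i and port[y] == j}
    return (vertices, edges | new, port)

def forget(g):
    """Drop all port labels, returning a plain graph."""
    return (g[0], g[1])

def compose_via_vr(H_edges, parts):
    """Build H(G_1, ..., G_n) by marking, taking unions, adding edges, forgetting."""
    g = (set(), set(), {})
    for i, part in enumerate(parts, start=1):
        g = union(g, mark(part, i))
    for i, j in H_edges:
        if i != j:
            g = add(g, i, j)
    return forget(g)

# H is the directed path 1 -> 2 -> 3; G_1 is a single edge x -> y.
parts = [({"x", "y"}, {("x", "y")}), ({"z"}, set()), ({"w"}, set())]
V, E = compose_via_vr({(1, 2), (2, 3)}, parts)
```

The result keeps the internal edge of the first argument and adds edges only between blocks joined by an edge of H.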

The following result shows that the converse of Proposition 8.2 does not
hold.

Proposition 8.3 Every set of prime graphs is $\mathcal{F}_\infty$-recognizable, and there is a
set of prime graphs which is not VR-recognizable.

Proof. Let $L$ be a set of prime graphs, and let $\equiv$ be the relation on Graph
defined as follows. We let $G \equiv H$ if one of the following holds:

• neither G nor H is prime; 

• G and H are both 1 (the graph with one vertex and no edge); 

• G and H are both not 1, prime and in L; 

• G and H are both not 1, prime and not in L. 

This is clearly an equivalence relation with four classes, which saturates $L$.
Moreover, $\equiv$ is an $\mathcal{F}_\infty$-congruence. Indeed, let $K$ be a graph with $n$ vertices
and let $G_i \equiv H_i$ for each $i = 1, \ldots, n$. If for some $i$, $G_i \neq 1$, then $H_i \neq 1$, and
neither $K(G_1, \ldots, G_n)$ nor $K(H_1, \ldots, H_n)$ is prime: therefore they are equivalent.
Otherwise, $G_i = H_i = 1$ for each $i$, $K(G_1, \ldots, G_n)$ and $K(H_1, \ldots, H_n)$ are both
equal to $K$, and hence they are equivalent. This concludes the proof that every
set of prime graphs is $\mathcal{F}_\infty$-recognizable.

Before we exhibit a set of prime graphs which is not VR-recognizable, we
define inductively a sequence of VR-terms written with three port labels $a, b, c$.
We let

$$t_0 = \mathrm{add}_{a,b}(a \oplus b), \qquad t_{n+1} = \mathrm{ren}_{c\to b}(\mathrm{ren}_{b\to a}(\mathrm{add}_{b,c}(t_n \oplus c))).$$

The term $\mathrm{mdf}_\emptyset(t_n)$ (forgetting all port labels in $t_n$) denotes the string graph
$P_{n+2}$, with $n + 2$ vertices, say $1, \ldots, n + 2$, and edges from $i$ to $i + 1$ for each
$1 \leq i \leq n + 1$. Each of these graphs is prime.
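The terms t_n can be evaluated mechanically to confirm that they denote directed paths. A minimal sketch, assuming a graph with ports is a triple (vertices, directed edges, vertex-to-port-label map); the vertex numbering and the function names are our own conventions.

```python
def vertex(label, name):
    """A one-vertex graph whose vertex `name` carries port label `label`."""
    return ({name}, set(), {name: label})

def union(g, h):
    return (g[0] | h[0], g[1] | h[1], {**g[2], **h[2]})

def add(g, i, j):
    """add_{i,j}: an edge from every i-port to every j-port."""
    vertices, edges, port = g
    new = {(x, y) for x in vertices for y in vertices
           if port[x] == i and port[y] == j}
    return (vertices, edges | new, port)

def ren(g, i, j):
    """ren_{i -> j}: relabel every i-port as a j-port."""
    vertices, edges, port = g
    return (vertices, edges, {v: (j if p == i else p) for v, p in port.items()})

def t(n):
    """Value of the term t_n; vertices are numbered 1, ..., n + 2."""
    g = add(union(vertex("a", 1), vertex("b", 2)), "a", "b")      # t_0
    for k in range(n):
        g = add(union(g, vertex("c", k + 3)), "b", "c")
        g = ren(ren(g, "b", "a"), "c", "b")
    return g

def mdf_empty(g):
    """Forget all port labels."""
    return (g[0], g[1])
```

At each step the unique b-port is the current end of the path, so the new c-vertex is attached exactly there before the labels are shifted.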

Now let $A$ be a set of positive integers that is not recognizable in $\langle \mathbb{N}, \mathrm{succ}, 0 \rangle$
and let $L$ be the set of all graphs $P_n$ with $n \in A$. From the above discussion, we
know that $L$ is $\mathcal{F}_\infty$-recognizable. If $L$ were VR-recognizable, standard arguments
would show that the set of VR-terms $t_n$ ($n \in A$) would be recognizable as well,
and it would follow that $A$ is recognizable, contradicting its choice. □
Now let $\mathcal{F}$ be a finite subsignature of the modular signature $\mathcal{F}_\infty$. A graph
which can be constructed from one-vertex graphs using only operations from
$\mathcal{F}$ is called an $\mathcal{F}$-graph. The next result deals with sets of $\mathcal{F}$-graphs. This
finiteness condition (the elements of $L$ are built by repeated composition of
a finite number of graph-based operations) is non-trivial. In fact, for many
natural classes of graphs such as rectangular grids, it is not satisfied: since grids
are indecomposable, a set of graphs containing infinitely many grids cannot
satisfy our finiteness condition. But that condition is satisfied by other classical
classes (e.g. cographs, series-parallel posets), see [T^lUB] .

Using results of Courcelle we can show the following result, which yields
in particular a weak converse of Proposition 8.2.

Theorem 8.4 Let $\mathcal{F}$ be a finite subsignature of $\mathcal{F}_\infty$ and let $L$ be a set of
$\mathcal{F}$-graphs. The following properties are equivalent:

1. $L$ is $\mathcal{S}$-recognizable;

2. $L$ is VR-recognizable;

3. $L$ is $\mathcal{F}_\infty$-recognizable;

4. $L$ is $\mathcal{F}$-recognizable.

Proof. The equivalence of (1) and (2) can be found in Theorem 4.5. Proposition 8.2
shows that (2) implies (3). And (3) implies (4) as an immediate
consequence of Proposition 2.1 since $\mathcal{F}$ is a subsignature of $\mathcal{F}_\infty$. The fact that
(4) implies (1) is a consequence of two results of Courcelle: ^3 Theorem 4.1],
which states that if a set of $\mathcal{F}$-graphs is $\mathcal{F}$-recognizable, then it is definable in
a certain extension of MS-logic; and Theorem 6.11], which states that all
sets definable in this logical language are $\mathcal{S}$-recognizable. □

Remark 8.5 Theorem 8.4 states that for sets of graphs with only finitely many
prime subgraphs, all four notions of recognizability are equivalent. Presented in
this fashion, the statement is somewhat similar to that of Theorem 6.1. □

9 Conclusion 

In this article, we have investigated the recognizability of sets of graphs quite 
in detail, focusing on the robustness of the notion, which was not immediate 
since many signatures on graphs can be defined. Although we had in mind sets 
of graphs, we have proved that embedding graphs in the more general class of 
relational structures does not alter recognizability. We have proved that the
very same structural conditions that equate VR-equational and HR-equational
sets of graphs also equate HR-recognizability and VR-recognizability.

Summing up, we have defined a number of tools for handling recognizability. 
Some questions remain to be investigated.
• When is it true that a quantifier-free operation preserves recognizability?

Results in this direction have been established in Courcelle (TO]. Are they 
applicable to quantifier-free definable operations? In particular, is it true that 
the set of disjoint unions of two graphs, one from each of two VR-recognizable 
sets is VR-recognizable ? 

• Which quantifier-free definable operations can be added to the signature HR, 
in such a way that the class of HR-recognizable sets is preserved (as is the case 
when we extend VR to VR+)? The paper by Blumensath and Courcelle 
which continues the present research, considers unary non qfd operations that 
can be added to VR + and to StS while preserving the classes of equational and 
recognizable sets. 

• Our example of an HR-recognizable, not VR-recognizable set of cliques, is 
based on the weakness of the parallel composition of graphs with sources, i.e., 
the fact that this operation is not able to split large cliques. Can one find 
another example, based on a different argument? If one cannot, what does this 
mean? 

We conclude with an observation concerning the finiteness of signatures. 
Whereas all finite words on a finite alphabet can be generated by this alphabet 
and only one operation, dealing with finite graphs (by means of grammars, 
automata and related tools) requires infinite signatures. More precisely, one
needs infinitely many operations to generate all finite unlabelled graphs (see
Remark 9.1 below). On the other hand, applications to testing graph properties
require the consideration of algebras generated by a finite signature. Here is the 
reason. 

Let $M$ be an $\mathcal{F}$-algebra of graphs. If the unique valuation homomorphism
$\mathrm{val}_M : T(\mathcal{F}) \to M$ (which evaluates a term into an element of $M$) is surjective,
i.e., if $\mathcal{F}$ generates $M$, then a subset $L$ of $M$ is recognizable if and only if
$\mathrm{val}_M^{-1}(L)$ is a recognizable set of terms (see Proposition 2.1 and Section |2~S|).
And the membership of a term in a recognizable set can be verified in linear
time by a finite deterministic (tree) automaton. Hence the membership of a
graph $G$ in $L$ can be checked as follows:

(1) One must first find some term $t$ such that $\mathrm{val}_M(t) = G$;

(2) then one checks whether $t$ belongs to $\mathrm{val}_M^{-1}(L)$.

The latter step can be done in time proportional to the size of $t$, usually
no larger than the number of vertices of $G$. Although any term $t$ with value $G$
gives the correct answer, it may be difficult to find at least one (graph parsing
problems may be NP-complete).
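Step (2) can be illustrated with a small deterministic bottom-up tree automaton. The sketch below is an assumption-laden toy: terms are nested tuples over a hypothetical signature with one constant "v" (a single vertex) and two binary operations "union" (disjoint union) and "join" (disjoint union plus all edges between the two parts, read as undirected graphs). The two states record whether the denoted graph is connected.

```python
def run(term, delta_const, delta_op):
    """Bottom-up evaluation of a deterministic tree automaton.

    Each node of the term triggers exactly one table lookup, so the running
    time is proportional to the size of the term.
    """
    if isinstance(term, str):
        return delta_const[term]
    op, *args = term
    states = tuple(run(a, delta_const, delta_op) for a in args)
    return delta_op[op][states]

STATES = ("conn", "disc")
delta_const = {"v": "conn"}                      # a single vertex is connected
delta_op = {
    # the disjoint union of two non-empty graphs is never connected
    "union": {(p, q): "disc" for p in STATES for q in STATES},
    # the join of two non-empty graphs is always connected
    "join": {(p, q): "conn" for p in STATES for q in STATES},
}
accepting = {"conn"}

def accepts(term):
    return run(term, delta_const, delta_op) in accepting

path3 = ("join", ("union", "v", "v"), "v")       # two leaves joined to a centre
split = ("union", "v", ("join", "v", "v"))       # an isolated vertex plus an edge
```

The transition tables play the role of a finite congruence: two terms reaching the same state cannot be distinguished by any context, which is what makes the linear-time check possible once a term is available.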

Because of this fact many hard problems (in particular if they are expressed
in Monadic Second-order logic) can be solved in linear time on sets of graphs
of bounded tree-width, and also on sets of graphs of bounded clique-width,
provided the graphs are given with appropriate decompositions, see Courcelle
[T5] , Courcelle and Olariu ^5] or Downey and Fellows [23]. If the decompositions
are not given, one can achieve linear time for graphs of bounded tree-width and
MS$_2$ problems using a result by Bodlaender 0] , and polynomial time for graphs
of bounded clique-width and MS$_1$ problems using a result by Oum and Seymour

However, even if $\mathcal{F}$ is infinite or is finite without generating the set $M$,
recognizability remains interesting as an algebraic concept, and for every restriction
to a finitely generated subset of $M$, we are back to the "good" case of a finitely
generated algebra.

Finally, we think that infinite signatures can be used for checking graph 
properties defining recognizable sets. This will not be possible by finite tree- 
automata if the graph algebra is not finitely generated, but it can perhaps be 
done with automata using "oracles". An oracle would be a subroutine han- 
dling some verifications for big subgraphs that cannot be decomposed by the 
operations under consideration. This idea needs of course further elaboration. 

Remark 9.1 We asserted above that finite unlabelled graphs cannot be generated
with a finite signature. This is not entirely correct, and we briefly describe
here a signature with 6 operations on a 2-sorted algebra which generates,
somewhat artificially, all finite graphs (undirected and without loops). These 
operations have no good behaviour with respect to automata and verification 
questions, and such an "economical" generation of graphs is useless. 

The 2 sorts are $o$, the set of finite graphs equipped with a linear order of their
vertex set, and $u$, the set of ordinary, unordered graphs. There is one unary
operation of type $o \to u$, which forgets the order on the vertex set. All other
operations are unary, of type $o \to o$: one consists in adding one new vertex,
to be the new least element; one adds an (undirected) edge between the two
least vertices; one performs a circular shift of the vertices; and one swaps the
two least vertices. The three last operations leave the graph unchanged if it has
fewer than 2 vertices. Finally, one adds a 6th, nullary operation, of type $o$: the
constant $\emptyset$, standing for the empty graph with no vertices. □
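The six operations of this remark are easy to prototype. In the sketch below an ordered graph is assumed to be a pair (tuple of vertices from least to greatest, set of undirected edges); the vertex names, the direction of the circular shift (the least vertex becomes the greatest), and all function names are our own choices.

```python
EMPTY = ((), frozenset())     # the nullary operation: the empty ordered graph

def new_vertex(g):
    """Add a fresh vertex as the new least element."""
    order, edges = g
    return ((len(order),) + order, edges)

def add_edge(g):
    """Add an undirected edge between the two least vertices."""
    order, edges = g
    return (order, edges | {frozenset(order[:2])}) if len(order) >= 2 else g

def shift(g):
    """Circular shift: the least vertex becomes the greatest."""
    order, edges = g
    return (order[1:] + order[:1], edges) if len(order) >= 2 else g

def swap(g):
    """Exchange the two least vertices."""
    order, edges = g
    return ((order[1], order[0]) + order[2:], edges) if len(order) >= 2 else g

def forget_order(g):
    """The operation of type o -> u: drop the linear order."""
    order, edges = g
    return (frozenset(order), edges)

# A term denoting the triangle, applied innermost-first.
g = EMPTY
for op in (new_vertex, new_vertex, add_edge, new_vertex, add_edge,
           swap, shift, add_edge):
    g = op(g)
triangle = forget_order(g)
```

Since `swap` and `shift` together can bring any two vertices to the two least positions, any edge can be added this way, which is the sense in which these operations generate all finite loop-free undirected graphs.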

A Equivalences of logical formulas 

In this appendix, we discuss some equivalences and transformations of logical 
formulas which can be used to give upper bounds for the index of congruences 
considered in this paper, and to complete the proof of the effectiveness of certain 
notions (e.g. quantifier-free definition schemes). 

More specifically, we make precise in what sense we can state, as we do in
the body of the paper, that the set of first-order (resp. monadic second-order)
formulas over finite sets of relations, constants and free variables, and with a
bounded quantification depth, can be considered as finite. Moreover, explicit
upper bounds on the size of these finite sets are derived, which can be used
to justify the termination of some of our algorithms, and in evaluating their
complexity. That these upper bounds have unbounded levels of exponentiation
is not unexpected, and even unavoidable by Frick and Grohe [26].
A.1 Boolean formulas



Let $p_1, \ldots, p_n$ be Boolean variables and let $B_n$ be the set of Boolean formulas
written with these variables. It is well known that $B_n$ is finite up to logical
equivalence. For further reference, we record the following more precise statement.

Proposition A.1 There exists a subset $B_n^{red}$ of $B_n$, of cardinality $2^{2^n}$, such that
every formula in $B_n$ can be effectively transformed into an equivalent formula
in $B_n^{red}$.

Proof. We let $B_n^{red}$ be the set of Boolean formulas in disjunctive normal form,
where in each disjunct, variables occur at most once and in increasing order,
no two disjuncts are equal, and disjuncts are ordered lexicographically. These
constraints guarantee the announced cardinality of $B_n^{red}$; the rest of the proof
is classical. □
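The count $2^{2^n}$ can be confirmed for small n by enumeration. The sketch below makes one explicit assumption: a reduced formula is identified with the set of its full minterms, that is, with the ordered list of truth assignments satisfying it (each variable occurring exactly once, positively or negatively, in each disjunct). The function names are ours.

```python
from itertools import product

def reduced_form(formula, n):
    """Canonical DNF of a Boolean function of n variables.

    The result is the tuple of satisfying assignments in lexicographic order;
    each assignment stands for one full minterm.
    """
    return tuple(bits for bits in product((False, True), repeat=n)
                 if formula(*bits))

def count_reduced_forms(n):
    """Number of distinct canonical forms: one per subset of the 2^n assignments."""
    rows = list(product((False, True), repeat=n))
    forms = {tuple(r for r, keep in zip(rows, mask) if keep)
             for mask in product((False, True), repeat=len(rows))}
    return len(forms)

# Two syntactically different but equivalent formulas share a reduced form.
f1 = lambda p, q: p or q
f2 = lambda p, q: not ((not p) and (not q))
```

Computing the reduced form by tabulating all assignments is effective but exponential in n, in line with the observation that the reduced formula is not always the shortest one.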

Of course, the formula in $B_n^{red}$ equivalent to a given formula is not always
the shortest possible.

A.2 First-order formulas, semantic equivalence

Let us consider finite sets $R$ and $C$, of relational symbols and of constants
(nullary relations, source labels) as in Section 5.1. Recall that, if $X$ is a finite set,
$FO(R, C, X)$ denotes the set of first-order formulas in the language of $(R, C)$-structures,
with free variables in $X$. For unproved results in this section, we
refer the reader to

Several notions of semantic equivalence of formulas can be defined. If $\varphi, \psi \in
FO(R, C, X)$, say that $\varphi \equiv \psi$ if for every $(R, C)$-structure $S$ and for every
assignment of values in $S$ to the elements of $X$, $\varphi$ and $\psi$ are both true or both
false. Say also that $\varphi \equiv_\omega \psi$ if the same holds for every finite or countable
$(R, C)$-structure $S$, and $\varphi \equiv_f \psi$ if $S$ is restricted to being finite.

The equivalences $\equiv$ and $\equiv_\omega$ coincide by the Löwenheim-Skolem theorem.
Indeed this theorem states that if a closed formula has an infinite model, then
it has one of each infinite cardinality: to prove our claim, it suffices to apply it
to the formula $\exists \vec{x}\, \neg(\varphi(\vec{x}) \Leftrightarrow \psi(\vec{x}))$. We note that this equivalence cannot be
extended to monadic second-order formulas: there exists an MS formula with a
unique model, isomorphic to the set of integers $\mathbb{N}$ with its order.

Each of these three equivalences is known to be undecidable.

The equivalence $\equiv$ (or $\equiv_\omega$ since we consider only first-order formulas) is
semi-decidable: by Gödel's completeness theorem, $\varphi \equiv \psi$ if and only if the formula
$\forall \vec{x}\, (\varphi(\vec{x}) \Leftrightarrow \psi(\vec{x}))$ has a proof, which is a recursively enumerable property.

Trakhtenbrot proved that one cannot decide whether a first-order formula is
true in every finite structure, thus proving that $\equiv_f$ is not decidable. However,
the negation of $\equiv_f$ is semi-decidable: if $\varphi \not\equiv_f \psi$, a counter-example can be
produced by exploring systematically all finite $(R, C)$-structures. This is also a
proof that $\equiv$ and $\equiv_f$ do not coincide.
A.3 First-order formulas, a syntactic equivalence

We now describe a syntactic equivalence $\approx$ on formulas, which refines the
semantic equivalences $\equiv$ and $\equiv_f$; that is, if $\varphi \approx \psi$, then $\varphi \equiv \psi$ and $\varphi \equiv_f \psi$.

If $b \in B_n$, and if $\varphi_1, \ldots, \varphi_n \in FO(R, C, X)$, we denote by $b(\varphi_1, \ldots, \varphi_n)$
the formula in $FO(R, C, X)$ obtained by replacing each occurrence of $p_i$ in
$b$ by $\varphi_i$. It is clear that if $b$ and $b'$ are equivalent Boolean formulas, then
$b(\varphi_1, \ldots, \varphi_n) \equiv b'(\varphi_1, \ldots, \varphi_n)$.

A Boolean transformation step consists in replacing, in a first-order formula,
a sub-formula of the form $b(\varphi_1, \ldots, \varphi_n)$ by the equivalent formula $b'(\varphi_1, \ldots, \varphi_n)$,
where $b, b' \in B_n$ are equivalent. Then we let $\varphi \approx \psi$ if $\varphi$ can be transformed
into $\psi$ by a sequence of Boolean transformation steps and of renaming of bound
variables.

It is clear that if $\varphi \approx \psi$, then $\varphi \equiv \psi$. We want to show that each first-order
formula is effectively equivalent to an $\approx$-equivalent formula of the same
quantifier height, and to give an upper bound on the number of $\approx$-equivalence
classes of formulas of a given height.

A.3.1 Quantifier-free formulas

Let $QF(R, C, X)$ be the set of quantifier-free formulas in $FO(R, C, X)$. Such
formulas are Boolean combinations of atomic formulas. Let $Atom(R, C, X)$ be
the set of these atomic formulas. Note that each atomic formula is either of the
form $x = y$, where $x$ and $y$ are in $X \cup C$, or $r(x_1, \ldots, x_{\rho(r)})$ where $r$ is a $\rho(r)$-ary
relation in $R$ and the $x_i$ are in $X \cup C$. Letting $n = \mathrm{card}(X)$ and $c = \mathrm{card}(C)$, it
is easily verified that

$$\mathrm{card}(Atom(R, C, X)) = (n + c)^2 + \sum_{r \in R} (n + c)^{\rho(r)}.$$

We let $f(R, c, n)$ be this function. Note that if we allow for the (effective)
syntactic simplifications of identifying the formulas of the form $x = x$ with the
constant true, and of identifying the formulas $x = y$ and $y = x$, we can lower
the value of $f(R, c, n)$ to $1 + \frac{1}{2}(n + c)(n + c - 1) + \sum_{r \in R} (n + c)^{\rho(r)}$.
We then have the following.

Proposition A.2 There exists a subset $QF^{red}(R, C, X)$ of $QF(R, C, X)$, of
cardinality $2^{2^{f(R,c,n)}}$, such that every formula in $QF(R, C, X)$ can be effectively
transformed to an $\approx$-equivalent formula in $QF^{red}(R, C, X)$.

Proof. By definition of quantifier-free formulas, $QF(R, C, X)$ is the set of all
formulas of the form $b(\varphi_1, \ldots, \varphi_n)$, where $b$ is a Boolean formula and the $\varphi_i$ are
atomic formulas. Now let $QF^{red}(R, C, X)$ be the set of all formulas of the form
$b(\varphi_1, \ldots, \varphi_n)$, where $b \in B_n^{red}$ and the $\varphi_i$ are pairwise distinct atomic formulas.
The proof of the precise statement is now immediate, using Proposition A.1. □
Example A.3 Let us consider graphs with sources, so that $R$ consists of a
single, binary edge relation. Then $f(R, c, 0) = 2c^2$ and $\mathrm{card}(QF^{red}(R, C, \emptyset)) =
2^{2^{2c^2}} = q(c)$. Thus the type equivalence $\zeta$ (see Section 0751 and Lemma ET%1) has
at most $2^{q(c)}$ classes in $\mathcal{GS}(C)$. □
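The atom count used in this example can be cross-checked by listing the atomic formulas explicitly. A small sketch, assuming relations are given as a name-to-arity dictionary and terms as a list of variable and constant names; `atoms` and `f` are illustrative names, with `f` following the formula $(n + c)^2 + \sum_{r \in R} (n + c)^{\rho(r)}$.

```python
from itertools import product

def atoms(relations, terms):
    """All atomic formulas over `terms` (free variables and constants)."""
    equalities = [("=", x, y) for x, y in product(terms, repeat=2)]
    relational = [(r, *args) for r, arity in relations.items()
                  for args in product(terms, repeat=arity)]
    return equalities + relational

def f(relations, c, n):
    """Number of atomic formulas with n free variables and c constants."""
    return (n + c) ** 2 + sum((n + c) ** arity for arity in relations.values())

edge_only = {"edge": 2}                      # a single binary edge relation

def q(c):
    """Cardinality 2^(2^(2 c^2)) of the reduced quantifier-free formulas."""
    return 2 ** (2 ** f(edge_only, c, 0))
```

Already for a single source label the count of reduced formulas is 16, and it grows doubly exponentially in the square of the number of source labels.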

Remark A.4 Again, we are not claiming that the set $QF^{red}(R, C, X)$ is as
small as possible. On quantifier-free formulas, the equivalence $\equiv$ is decidable,
because $\varphi \equiv \psi$ is false if and only if the closed formula $\exists \vec{x}\, \neg(\varphi(\vec{x}) \Leftrightarrow \psi(\vec{x}))$
is satisfiable, and the satisfiability problem for existential formulas in prenex
normal form is decidable (see .6 ). Thus one can modify Proposition A.2 by
letting $QF^{red}(R, C, X)$ be the set of lexicographically minimal formulas in each
$\equiv$-class: the same statement of Proposition A.2 would then hold with $\equiv$ instead
of $\approx$. In particular, the transformation would still be effective, although very
inefficient. It is not clear whether the cardinality of the new set of reduced
quantifier-free formulas would be significantly smaller. □



A.3.2 Quantifier depth of first-order formulas

Recall that the quantifier depth of a first-order formula is the maximal number
of nested quantifiers. If we let $FO_k(R, C, X)$ be the set of formulas in
$FO(R, C, X)$ of quantifier depth at most $k$, a formal definition is as follows:
$FO_0(R, C, X) = QF(R, C, X)$ and, for each $k \geq 0$, $FO_{k+1}(R, C, X)$ is the set
of Boolean combinations of formulas in

$$\begin{aligned}
\widehat{FO}_k(R, C, X) = FO_k(R, C, X) &\cup \{\exists y\, \varphi \mid \varphi \in FO_k(R, C, X \cup \{y\})\} \\
&\cup \{\forall y\, \varphi \mid \varphi \in FO_k(R, C, X \cup \{y\})\}.
\end{aligned}$$

Using the same recursion, let us define sets of "reduced" formulas of every
quantifier depth. First we fix an enumeration of the countable set of variables.
Next, we let $FO_0^{red}(R, C, X) = QF^{red}(R, C, X)$. For each $k \geq 0$, we then let
$FO_{k+1}^{red}(R, C, X)$ be the set of formulas of the form $b(\varphi_1, \ldots, \varphi_n)$ where $b \in B_n^{red}$
and the $\varphi_i$'s are in

$$\begin{aligned}
\widehat{FO}{}_k^{red}(R, C, X) = FO_k^{red}(R, C, X) &\cup \{\exists y\, \varphi \mid \varphi \in FO_k^{red}(R, C, X \cup \{y\}),\ y \text{ minimal not in } X\} \\
&\cup \{\forall y\, \varphi \mid \varphi \in FO_k^{red}(R, C, X \cup \{y\}),\ y \text{ minimal not in } X\}.
\end{aligned}$$

Proposition A.5 For each $k \geq 0$, the set $FO_k^{red}(R, C, X)$ is finite. Moreover,
every formula in $FO_k(R, C, X)$ can be effectively transformed to an $\approx$-equivalent
formula in $FO_k^{red}(R, C, X)$.

Proof. Let $n = \mathrm{card}(X)$ and $c = \mathrm{card}(C)$, let $g(k, R, c, n)$ be the cardinality of
$FO_k^{red}(R, C, X)$, and let $h(k, R, c, n)$ be the cardinality of $\widehat{FO}{}_{k-1}^{red}(R, C, X)$. It is
elementary to verify that these functions can be bounded as follows:

$$\begin{aligned}
g(0, R, c, n) &\leq 2^{2^{f(R,c,n)}} \quad \text{and, for } k > 0, \\
g(k, R, c, n) &\leq 2^{2^{h(k,R,c,n)}}, \\
h(k, R, c, n) &\leq 3\, g(k - 1, R, c, n + 1).
\end{aligned}$$

The rest of the proof is immediate, from the recursive definitions. □
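To give a feeling for how fast these bounds grow, here is a hedged worked instance. It assumes the bounds are read as $g(0,R,c,n) \leq 2^{2^{f(R,c,n)}}$, $g(k,R,c,n) \leq 2^{2^{h(k,R,c,n)}}$ and $h(k,R,c,n) \leq 3\,g(k-1,R,c,n+1)$, with $f(R,c,n) = (n+c)^2 + \sum_{r \in R}(n+c)^{\rho(r)}$, and it takes $R$ to consist of a single binary relation, with no constants and no free variables at depth 1.

```latex
\begin{aligned}
f(R, 0, 1) &= 1^2 + 1^2 = 2, \\
g(0, R, 0, 1) &\leq 2^{2^{2}} = 16, \\
h(1, R, 0, 0) &\leq 3 \cdot g(0, R, 0, 1) \leq 48, \\
g(1, R, 0, 0) &\leq 2^{2^{48}}.
\end{aligned}
```

Each additional level of quantifier depth adds two levels of exponentiation, which is the non-elementary growth mentioned at the beginning of this appendix.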

Remark A.6 Since there is a procedure to transform each first-order formula
into an $\approx$-equivalent formula in "reduced form", we can consider a new equivalence
relation on first-order formulas: to yield the same reduced formula. This
equivalence is decidable and it refines $\approx$ (and hence $\equiv$). □

Remark A.7 In Proposition A.5, we can still consider replacing each formula
by the lexicographically least equivalent formula, but this method is not effective,
since the equivalence of first-order formulas is not decidable. □

A.4 Monadic second-order formulas

A very similar analysis can be conducted for monadic second-order formulas of
bounded quantifier depth. One difference is that the Löwenheim-Skolem theorem
does not hold for these formulas, so the semantic equivalence of formulas
based on coincidence on all finite or countable models does not imply coincidence
on all models. Moreover, since there is no complete proof system for
such formulas, the equivalences $\equiv$ and $\equiv_\omega$ are not semi-decidable.

For the rest, one can follow the same techniques as above, to prove the
following result. We denote by $MS_k(R, C, W)$ the set of monadic second-order
formulas of quantification depth $k$ in the language of $(R, C)$-structures, with
their first- and second-order free variables in $W$.

Proposition A.8 For every finite $R, C, W$ and every $k$, one can construct a finite subset
$MS_k^{red}(R, C, W)$ of $MS_k(R, C, W)$ such that, for every formula in $MS_k(R, C, W)$,
one can construct effectively an $\equiv$-equivalent formula in $MS_k^{red}(R, C, W)$.